We first give a characterization of the $L^1$-transportation cost-information inequality on a metric space and next find some appropriate sufficient conditions for transportation cost-information inequalities for dependent sequences. Applications to random dynamical systems and diffusions are studied.

1. Introduction and questions. Let $(E, d)$ be a metric space equipped with a $\sigma$-field $\mathcal{B}$ such that $d(\cdot,\cdot)$ is $\mathcal{B}\times\mathcal{B}$-measurable. Given $p \ge 1$ and two probability measures $\mu$ and $\nu$ on $E$, we define the quantity

(1.1)    $W_p^d(\mu,\nu) = \inf\left(\int\!\!\int d(x,y)^p \, d\pi(x,y)\right)^{1/p}$,

where the infimum is taken over all probability measures $\pi$ on the product space $E \times E$ with marginal distributions $\mu$ and $\nu$ [say, couplings of $(\mu,\nu)$]. This infimum is finite as soon as $\mu$ and $\nu$ have finite moments of order $p$. This quantity is commonly referred to as the $L^p$-Wasserstein distance between $\mu$ and $\nu$. When $d$ is the trivial metric ($d(x,y) = 1_{x \ne y}$), $2W_1^d(\mu,\nu) = \|\mu-\nu\|_{TV}$, the total variation of $\mu - \nu$.
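For intuition, (1.1) can be evaluated explicitly in one simple setting. The sketch below assumes $E = \mathbb{R}$ with $d(x,y) = |x-y|$ and two empirical measures with the same number of equally weighted atoms; in that case the monotone (sorted) coupling is optimal for every $p \ge 1$. The function name `wasserstein_1d` is illustrative only and does not come from the paper.

```python
def wasserstein_1d(xs, ys, p=1.0):
    """L^p-Wasserstein distance between two empirical measures on the real line.

    Assumption: both samples have the same size and uniform weights, so the
    sorted (monotone) coupling attains the infimum in (1.1) for p >= 1.
    """
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("samples must be non-empty and of equal size")
    if p < 1:
        raise ValueError("p must be at least 1")
    # Pair the i-th smallest atom of one sample with the i-th smallest of the other.
    pairs = zip(sorted(xs), sorted(ys))
    cost = sum(abs(x - y) ** p for x, y in pairs) / len(xs)
    return cost ** (1.0 / p)


if __name__ == "__main__":
    mu = [0.0, 1.0, 3.0]
    nu = [5.0, 1.0, 2.0]
    print(wasserstein_1d(mu, nu, p=1))
    print(wasserstein_1d(mu, nu, p=2))
```

For general metric spaces no such shortcut exists and (1.1) is a genuine optimization over couplings.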

The Kullback information (or relative entropy) of v with respect to /i is 
defined as 

(1.2)    $H(\nu/\mu) = \begin{cases} \displaystyle\int \log\frac{d\nu}{d\mu}\, d\nu, & \text{if } \nu \ll \mu, \\ +\infty, & \text{otherwise.} \end{cases}$



We say that the probability measure $\mu$ satisfies the $L^p$-transportation cost-information inequality on $(E, d)$ if there is some constant $C > 0$ such that for any probability measure $\nu$,

(1.3)    $W_p^d(\mu,\nu) \le \sqrt{2C\,H(\nu/\mu)}$.

To be short, we write $\mu \in T_p(C)$ for this relation.

The cases "$p = 1$" and "$p = 2$" are particularly interesting. That $T_1(C)$ is related to the phenomenon of measure concentration was emphasized by Marton [10, 11], Talagrand [18], Bobkov and Götze [2] and amply explored by Ledoux [8, 9]. The $T_2(C)$, first established by Talagrand [18] for the Gaussian measure, has been brought into relation with the log-Sobolev inequality, the Poincaré inequality, inf-convolution and Hamilton-Jacobi equations by Otto and Villani [15] and Bobkov, Gentil and Ledoux [1]. Since those important works, a main trend in the field has been to relate $T_p(C)$ to other functional inequalities (of geometrical nature in particular). In this paper we study three questions around the following problem, which goes somewhat in the opposite direction: how can one establish $T_p(C)$ without reference to other functional inequalities in various concrete situations?

To raise our first question, let us mention the following: 



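Theorem 1.1 (Bobkov and Götze [2]). $\mu$ satisfies the $L^1$-transportation cost-information inequality on $(E, d)$ with constant $C > 0$, that is, $\mu \in T_1(C)$, if and only if for any Lipschitzian function $F : (E, d) \to \mathbb{R}$, $F$ is $\mu$-integrable and

(1.4)    $\displaystyle\int_E e^{\lambda(F - \langle F\rangle_\mu)}\, d\mu \le \exp\left(\frac{\lambda^2}{2}\, C\, \|F\|_{\mathrm{Lip}}^2\right) \quad \forall \lambda \in \mathbb{R}$,

where $\langle F\rangle_\mu = \int_E F\, d\mu$ and

$\displaystyle\|F\|_{\mathrm{Lip}} = \sup_{x \ne y} \frac{|F(x) - F(y)|}{d(x,y)} < +\infty$.

In that case,

$\displaystyle\mu(F - \langle F\rangle_\mu > r) \le \exp\left(-\frac{r^2}{2C\|F\|_{\mathrm{Lip}}^2}\right) \quad \forall r > 0$.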



It might be worthwhile to recall the classical Pinsker-Csiszár inequality, which is the starting point of many recent works. By the coupling characterization of the total variation distance $\|\cdot\|_{TV}$, the Pinsker-Csiszár inequality

$\displaystyle\frac12 \|\nu - \mu\|_{TV} \le \sqrt{\frac12 H(\nu/\mu)}$

says that w.r.t. the trivial distance $d(x,y) = 1_{x \ne y}$ on $E$, any probability measure $\mu$ satisfies the $L^1$-transportation cost-information inequality with the sharp constant $C = 1/4$. And by Theorem 1.1, the Pinsker-Csiszár inequality for the trivial distance follows from the classical inequality: for a real bounded random variable $\xi$ with values in $[a, b]$,

$\displaystyle\mathbb{E}\, e^{\lambda(\xi - \mathbb{E}\xi)} \le \exp\left(\frac{\lambda^2 (b-a)^2}{8}\right)$

(and vice versa).
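The Pinsker-Csiszár inequality is easy to test numerically on a finite state space. The following minimal sketch assumes probability vectors given as Python lists and uses the normalization of the text, in which $\|\nu-\mu\|_{TV}$ is the total mass of the signed measure $\nu - \mu$; the helper names are hypothetical.

```python
import math


def relative_entropy(nu, mu):
    """H(nu/mu) from (1.2) for two probability vectors on a finite set."""
    h = 0.0
    for q, p in zip(nu, mu):
        if q > 0.0:
            if p == 0.0:
                return math.inf  # nu is not absolutely continuous w.r.t. mu
            h += q * math.log(q / p)
    return max(h, 0.0)  # guard against tiny negative rounding errors


def total_variation(nu, mu):
    """||nu - mu||_TV, the total mass of the signed measure nu - mu."""
    return sum(abs(q - p) for q, p in zip(nu, mu))


def pinsker_gap(nu, mu):
    """sqrt(H(nu/mu)/2) - ||nu - mu||_TV / 2, which Pinsker-Csiszar says is >= 0."""
    return math.sqrt(0.5 * relative_entropy(nu, mu)) - 0.5 * total_variation(nu, mu)


if __name__ == "__main__":
    mu = [0.5, 0.5]
    nu = [0.9, 0.1]
    print(relative_entropy(nu, mu), total_variation(nu, mu), pinsker_gap(nu, mu))
```

A nonnegative gap for every pair of probability vectors is exactly the statement $\mu \in T_1(1/4)$ for the trivial metric.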

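We now make a simple remark. Assume that $\mu \in T_1(C)$ or, equivalently, (1.4). Let $\gamma(d\lambda)$ be the standard Gaussian law $\mathcal{N}(0, 1)$ on $\mathbb{R}$. We have for any Lipschitzian function $F$ on $E$ with $\langle F\rangle_\mu = 0$ and $\|F\|_{\mathrm{Lip}} \le 1$ and $a \in \mathbb{R}$,

$\displaystyle\int_E \exp\left(\frac{a^2}{2}F^2\right) d\mu = \int_E \int_{\mathbb{R}} e^{a\lambda F}\,\gamma(d\lambda)\, d\mu \le \int_{\mathbb{R}} \exp\left(\frac{C}{2}\, a^2\lambda^2\right) \gamma(d\lambda) = \begin{cases} \dfrac{1}{\sqrt{1 - a^2 C}}, & \text{if } a^2 < \dfrac{1}{C}, \\ +\infty, & \text{otherwise.} \end{cases}$

Applying it to $F(x) := d(x,x_0) - \int d(x,x_0)\, d\mu(x)$, we obtain

$\displaystyle\int e^{c\, d^2(x,x_0)}\, d\mu(x) < +\infty \quad \forall c \in \left(0, \frac{1}{2C}\right)$.

In particular, for all $\delta \in (0, \frac{1}{4C})$ we have

(1.5)    $\displaystyle\int\!\!\int e^{\delta d^2(x,y)}\, d\mu(x)\, d\mu(y) < +\infty$.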
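The Gaussian integral used in this remark can be checked numerically. The sketch below writes $s$ for the product $a^2 C$ and compares a plain trapezoidal approximation of $\int \exp(s\lambda^2/2)\,\gamma(d\lambda)$ with the closed form $1/\sqrt{1-s}$; the integration window and step count are arbitrary choices made for this illustration and are only adequate when $s$ is not too close to 1.

```python
import math


def gaussian_integral_numeric(s, half_width=40.0, steps=200_000):
    """Trapezoidal approximation of the integral of exp(s*x^2/2) against N(0,1).

    s stands for a^2 * C in the text; the integral is finite only when s < 1.
    """
    if s >= 1.0:
        return math.inf
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        x = -half_width + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        # exp(s x^2 / 2) times the Gaussian density kernel exp(-x^2 / 2)
        total += weight * math.exp(-(1.0 - s) * x * x / 2.0)
    return total * h / math.sqrt(2.0 * math.pi)


def gaussian_integral_closed_form(s):
    """Closed form 1/sqrt(1 - s) for s < 1, and +infinity otherwise."""
    return 1.0 / math.sqrt(1.0 - s) if s < 1.0 else math.inf


if __name__ == "__main__":
    for s in (0.0, 0.5, 0.9):
        print(s, gaussian_integral_numeric(s), gaussian_integral_closed_form(s))
```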
That naturally leads to the following questions: 

Question 1. Is the Gaussian tail (1.5) sufficient for the $L^1$-transportation cost-information inequality of $\mu$?

The second question is about dependent tensorizations of $T_p(C)$. Let, for example, $\mathbb{P}_x^n$ be the law of a homogeneous Markov chain $(X_k(x))_{1 \le k \le n}$ on $E^n$ starting from $x \in E$, with transition kernel $P(x, dy)$.

Question 2. Assume that $P(x,\cdot) \in T_p(C)$ for all $x \in E$. What is the appropriate condition under which $\mathbb{P}_x^n$ satisfies the $L^p$-transportation cost-information inequality w.r.t. the metric

$\displaystyle d_{l_p}(x,y) := \left(\sum_{i=1}^n d(x_i,y_i)^p\right)^{1/p}$?

The same question can be raised for the law of an arbitrary dependent sequence $(X_k)_{1 \le k \le n}$. When $(X_k)_{1 \le k \le n}$ are independent, this question has a rapid and affirmative answer; see [8, 9] and references therein.

In the dependent case, when $d$ is the trivial metric and $p = 1$ (and $d_{l_1}$ becomes the Hamming distance on $E^n$), Marton [10] generalized the Pinsker-Csiszár inequality to the law of the so-called "contracting" Markov chains:

(1.6)    $\displaystyle\frac12 \sup_{(x_{i-1},\, y_{i-1})} \|P_i(\cdot/y_{i-1}) - P_i(\cdot/x_{i-1})\|_{TV} \le r < 1$.

Her approach is based on coupling ideas, natural by the definition of the involved Wasserstein distance. Her results have been strengthened by Marton [11, 12] and Dembo [4] and have been generalized to uniform mixing processes by Samson [17] and Rio [16].

However, the trivial distance does not reflect the natural metric struc- 
ture of the state space E to which usual Markov processes such as random 
dynamical systems or diffusions are related and that is why the uniform 
mixing assumption was made in her work (and also in [17]). This is a main 
motivation for Question 2. 

For the $L^2$-transportation cost-information inequality $T_2(C)$, recall that Talagrand [18] proved that the standard Gaussian law $\gamma = \mathcal{N}(0, 1)$ satisfies $T_2(C)$ on $\mathbb{R}$ w.r.t. the Euclidean distance with the sharp constant $C = 1$ and found that $T_2(C)$ is stable for product (or independent) tensorization. To our knowledge the Markovian tensorization of $T_2(C)$ has not been investigated in the literature.

Since the works of Otto and Villani [15] and Bobkov, Gentil and Ledoux [1], we know that $T_2(C)$ follows from the log-Sobolev inequality in the framework of Riemannian manifolds. Indeed, all known $T_2(C)$-inequalities up to now can be derived from the log-Sobolev inequality. An important open question in the field is whether $T_2(C)$ is strictly weaker than the log-Sobolev inequality. Hence, it would be interesting to investigate the following question:

Question 3. How do we establish the $T_2(C)$-inequality in situations where the log-Sobolev inequality is unknown?

This paper is written around those three questions and it is organized as follows. The next section is the general theoretical part of this paper. After noting the stability of $T_p(C)$ under Lipschitzian maps and under weak convergence in Sections 2.1 and 2.2, in Section 2.3 we prove that condition (1.5) is, in fact, sufficient for the $L^1$-transportation cost-information inequality, solving Question 1. In Section 2.4 we revisit the coupling method of Marton and show that it actually works for dependent tensorization of $T_p(C)$ for $1 \le p \le 2$, under a contraction assumption [see (C1) in Theorem 2.5] close to Marton's (1.6). Section 2.5 revisits the McDiarmid-Rio martingale method, which allows us to obtain a much more subtle condition (C1') than (C1) for tensorization of $T_1(C)$ in Theorem 2.11.

Sections 3 and 4 contain several applications of the general results in Section 2 to random dynamical systems and diffusions, which are our main motivation for Question 2.

In Section 5, which is quite independent, we present a direct approach to $T_2(C)$ for diffusions, by means of the Girsanov transformation, with respect to the usual Cameron-Martin metric or $L^2$-metric.

The reader may consult the recent monograph by Villani [19] for an extended (analytical and geometrical) treatment of transportation.

2. Criteria for $T_p(C)$. Throughout this paper let $(E,d)$ be a metric space equipped with a $\sigma$-field $\mathcal{B}$ such that $d(\cdot,\cdot)$ is $\mathcal{B}\times\mathcal{B}$-measurable; and when $(E, d)$ is separable, $\mathcal{B}$ will be the Borel $\sigma$-field.

2.1. Stability under push-forward by Lipschitz map. We begin with the stability of $T_p(C)$ under Lipschitzian maps and under weak convergence, which will be useful later.

Lemma 2.1. Assume that $\mu \in T_p(C)$ on $(E, d_E)$ and $(F, d_F)$ is another metric space. If $\Psi : (E, d_E) \to (F, d_F)$ is Lipschitzian,

$d_F(\Psi(x),\Psi(y)) \le \alpha\, d_E(x,y) \quad \forall x,y \in E$,

then $\tilde\mu := \mu \circ \Psi^{-1} \in T_p(C\alpha^2)$ on $(F, d_F)$.

Proof. Let $\tilde\nu$ be a probability measure on $F$ such that $H(\tilde\nu/\tilde\mu) < +\infty$. The key remark is

(2.1)    $H(\tilde\nu/\tilde\mu) = \inf\{H(\nu/\mu);\ \nu \circ \Psi^{-1} = \tilde\nu\}$.

To prove it, putting $\nu_0(dx) := \frac{d\tilde\nu}{d\tilde\mu}(\Psi(x))\,\mu(dx)$, we see that $\nu_0 \circ \Psi^{-1} = \tilde\nu$ and $H(\nu_0/\mu) = H(\tilde\nu/\tilde\mu)$. We have for any $\nu$ so that $\nu \circ \Psi^{-1} = \tilde\nu$,

$\displaystyle H(\nu/\mu) = H(\nu_0/\mu) + \int_F d\tilde\nu(y)\, H(\nu_y/\mu_y)$,

where $\nu_y := \nu(\cdot/\Psi = y)$ and $\mu_y := \mu(\cdot/\Psi = y)$ are, respectively, the regular conditional distributions of $\nu$, $\mu$ knowing $\Psi = y$. Hence, (2.1) follows. With (2.1) in hand, the rest of the proof is easy and is omitted. □

2.2. Stability under weak convergence.

Lemma 2.2. Let $(E,d)$ be a metric, separable and complete space (Polish, say) and $(\mu_n, \mu)_{n \in \mathbb{N}}$ a family of probability measures on $E$. Assume that $\mu_n \in T_p(C)$ for all $n \in \mathbb{N}$ and $\mu_n \to \mu$ weakly. Then $\mu \in T_p(C)$.

Proof. Recall first two facts (see, e.g., [19]):

1. If $\mu_n \to \mu$ and $\nu_n \to \nu$ weakly, then $\liminf_{n\to\infty} W_p(\mu_n, \nu_n) \ge W_p(\mu, \nu)$.

2. If $\mu_n \to \mu$ weakly and $\{d(x, x_0)^p, \mu_n(dx)\}$ is uniformly integrable, then $W_p(\mu_n, \mu) \to 0$.

What one needs to prove is

$\displaystyle W_p^2(f\mu, \mu) \le 2C \int f \log f\, d\mu$

for all $f$ such that $f\mu$ is a probability. By approximation (and using fact 2 above), it is sufficient to prove the result for continuous $f$ such that $1/N \le f \le N$ over $E$ for some $N \ge 1$. Let $a_n = \int f\, d\mu_n$; we have by "$\mu_n \in T_p(C)$"

$\displaystyle W_p^2\left(\frac{f}{a_n}\mu_n, \mu_n\right) \le 2C \int \frac{f}{a_n} \log\frac{f}{a_n}\, d\mu_n = \frac{2C}{a_n} \int f \log\frac{f}{a_n}\, d\mu_n$.

Since $\mu_n$ converges weakly to $\mu$, $a_n$ converges to $\mu(f) = 1$, and one can pass to the limit in the right-hand side of this last inequality. For the convergence of the left-hand side, it is enough to apply the lower semi-continuity of $W_p$. □



2.3. Characterization of $T_1(C)$ by "Gaussian tail." We present here a characterization of $T_1(C)$, based on the Bobkov and Götze [2] result, that is, some Gaussian integrability property.

Theorem 2.3. A given probability measure $\mu$ on $(E,d)$ satisfies the $L^1$-transportation cost-information inequality with some constant $C$ on $(E,d)$ if and only if (1.5) holds. In the latter case,

(2.2)    $\displaystyle C \le \frac{2}{\delta} \sup_{k \ge 1} \left(\frac{(k!)^2}{(2k)!}\right)^{1/k} \left[\int\!\!\int e^{\delta d^2(x,y)}\, d\mu(x)\, d\mu(y)\right]^{1/k} < +\infty$.



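Proof. It is enough to show the sufficiency. By Bobkov-Götze's Theorem 1.1, it is enough to show that there is some constant $C = C(\delta)$ verifying (2.2) such that

(2.3)    $\displaystyle\mathbb{E}\, e^{\lambda F(\xi)} \le \exp\left(\frac{C\lambda^2}{2}\right) \quad \forall \lambda \in \mathbb{R}$,

for all $F : E \to \mathbb{R}$ with $\|F\|_{\mathrm{Lip}} \le 1$ and $\mathbb{E}F(\xi) = 0$, where $\xi$ is a random variable valued in $E$ with law $\mu$, defined on some probability space $(\Omega, \mathcal{F}, \mathbb{P})$.

Let $\xi'$ be an independent copy of $\xi$, defined on the same probability space $(\Omega, \mathcal{F}, \mathbb{P})$. Since $\mathbb{E}F(\xi') = 0$, by the convexity of $x \to e^x$, we have $\mathbb{E}(e^{-\lambda F(\xi')}) \ge 1$. Consequently, noting that $\mathbb{E}[F(\xi) - F(\xi')]^{2k+1} = 0$, we have

$\displaystyle\mathbb{E}(e^{\lambda F(\xi)}) \le \mathbb{E}(e^{\lambda F(\xi)})\, \mathbb{E}(e^{-\lambda F(\xi')}) = \mathbb{E}\, e^{\lambda(F(\xi) - F(\xi'))} = 1 + \sum_{k=1}^\infty \frac{\lambda^{2k}\, \mathbb{E}[F(\xi) - F(\xi')]^{2k}}{(2k)!}$.

Hence, putting

$\displaystyle C := 2 \sup_{k \ge 1} \left(\frac{k!\, \mathbb{E}[F(\xi) - F(\xi')]^{2k}}{(2k)!}\right)^{1/k}$,

we get

$\displaystyle\mathbb{E}(e^{\lambda F(\xi)}) \le 1 + \sum_{k=1}^\infty \frac{1}{k!} \left(\frac{C\lambda^2}{2}\right)^k = \exp\left(\frac{C\lambda^2}{2}\right)$.

Thus, for (2.3), it remains to estimate $C$ defined above. Since

$\displaystyle\mathbb{E}\, d(\xi,\xi')^{2k} \le k! \left(\frac{1}{\delta}\right)^k \mathbb{E} \exp(\delta\, d(\xi,\xi')^2)$,

we get

$\displaystyle C \le \frac{2}{\delta} \sup_{k \ge 1} \left(\frac{(k!)^2}{(2k)!}\right)^{1/k} \cdot \left[\mathbb{E} \exp(\delta\, d(\xi,\xi')^2)\right]^{1/k} < +\infty$,

the desired estimate (2.2). □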
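To get a feeling for the size of the constant in (2.2), one can evaluate its right-hand side numerically. The sketch below takes as inputs $\delta$ and the value $M$ of the double integral in (1.5), both assumed to be known, and truncates the supremum at a hypothetical cutoff `k_max`. Since $\binom{2k}{k} \ge 2^k$ and $M \ge 1$, the supremum appears to be attained at $k = 1$, which would reduce the bound to $M/\delta$; the code computes the truncated supremum directly rather than relying on that observation.

```python
import math


def t1_constant_bound(delta, gaussian_moment, k_max=200):
    """Right-hand side of (2.2), with the supremum truncated at k_max.

    delta           -- the exponent delta > 0 appearing in (1.5)
    gaussian_moment -- the value M >= 1 of the double integral in (1.5)
    k_max           -- truncation level for the sup (an assumption of this sketch)
    """
    if delta <= 0 or gaussian_moment < 1:
        raise ValueError("need delta > 0 and gaussian_moment >= 1")
    best = 0.0
    for k in range(1, k_max + 1):
        # (k!)^2 / (2k)! equals 1 / binom(2k, k); use logarithms for stability.
        log_term = (math.log(gaussian_moment) - math.log(math.comb(2 * k, k))) / k
        best = max(best, math.exp(log_term))
    return 2.0 * best / delta


if __name__ == "__main__":
    print(t1_constant_bound(delta=2.0, gaussian_moment=3.0))
```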

Remark 2.4. For comparison notice that the Bernoulli measure $\mu$ on $\{0,1\}$ with $\mu(1) \in (0,1)$ satisfies $T_1(1/4)$ w.r.t. the trivial metric, but does not satisfy $T_p(C)$ for any $p > 1$ (see [7]). Hence, any probability measure $\mu$ which is not a Dirac measure on $E$ does not satisfy $T_p(C)$ for any $p > 1$ w.r.t. the trivial metric. Another example illustrating the difference between the $T_1$ and $T_2$ inequalities is the following.

Let $\mu = \phi(x)^2\, dx$ on $\mathbb{R}$ with $0 \le \phi \in C_0^\infty(\mathbb{R})$ (compact support). It always satisfies $T_1(C)$ w.r.t. the Euclidean $d(x,y) := |y - x|$ by the theorem above. But if the support of $\mu$ (or of $\phi$) has two connected components $I_1$, $I_2$ with $\mathrm{dist}(I_1, I_2) > 0$, then the corresponding $T_2(C)$ fails. In fact, if on the contrary $\mu \in T_2(C)$, then by [15] or [1] the following Poincaré inequality holds for all smooth $f$:

$\displaystyle\mathrm{Var}_\mu(f) \le C \int f'^2\, d\mu$.

Choose now $f$ smooth enough and equal to 1 on $I_1$ and 0 on $I_2$. Then the right-hand side in the Poincaré inequality is 0, whereas the variance of $f$ is nonzero, so that the Poincaré inequality cannot hold, and neither can $T_2(C)$. This example shows, moreover, that $T_1(C)$ on $\mathbb{R}$ does not imply the Poincaré inequality, unlike $T_2(C)$.

The next two sections are dedicated to the tensorization of $T_p(C)$ for dependent sequences.

2.4. Weakly dependent sequences: Marton's coupling revisited. Let $\mathbb{P}$ be a probability measure on the product space $(E^n, \mathcal{B}^n)$, $n \ge 2$. For any $x \in E^n$, $x^i := (x_1, \ldots, x_i)$. Let $P_i(\cdot/x^{i-1})$ denote the regular conditional law of $x_i$ given $x^{i-1}$ for $i \ge 2$ (assume its existence). By convention $P_1(\cdot/x^0)$ is the law of $x_1$ under $\mathbb{P}$, where $x^0 = x_0$ is some fixed point. When $\mathbb{P}$ is Markov, then $P_i(\cdot/x^{i-1}) = P_i(\cdot/x_{i-1})$ is the transition kernel at step $i - 1$.

Our aim in this section is to extend transportation cost-information inequalities (1.3) to a probability measure $\mathbb{P}$ on $(E^n, d_{l_p})$, where

$\displaystyle d_{l_p}(x,y) := \left(\sum_{i=1}^n d(x_i,y_i)^p\right)^{1/p}$.

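Theorem 2.5. Let $\mathbb{P}$ be a probability measure on $E^n$, and $1 \le p \le 2$. Assume that $P_i(\cdot/x^{i-1}) \in T_p(C)$ on $(E,d)$ for all $i \ge 1$, $x^{i-1}$ in $E^{i-1}$ ($E^0 := \{x_0\}$). If

(C1) there exist $a_j \ge 0$ with $r^p := \sum_{j=1}^n (a_j)^p < 1$ such that

(2.4)    $\displaystyle [W_p^d(P_i(\cdot/x^{i-1}), P_i(\cdot/\tilde x^{i-1}))]^p \le \sum_{j=1}^{i-1} (a_j)^p\, d^p(x_{i-j}, \tilde x_{i-j})$,

for all $i \ge 1$, $x^{i-1}, \tilde x^{i-1}$ in $E^{i-1}$, then for any probability measure $\mathbb{Q}$ on $E^n$,

$\displaystyle W_p^{d_{l_p}}(\mathbb{Q}, \mathbb{P}) \le \frac{1}{1-r} \sqrt{2C\, n^{2/p-1}\, H(\mathbb{Q}/\mathbb{P})}$.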

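Proof. The proof is similar to the one used for the Hamming distance by Marton [10]; however, we have to use the assumption $P_i(\cdot/x^{i-1}) \in T_p(C)$ instead of Pinsker's inequality. Assume that $H(\mathbb{Q}/\mathbb{P}) < +\infty$ (trivial otherwise).

Let $Q_i(\cdot/x^{i-1})$ be the regular conditional law of $x_i$ knowing $x^{i-1}$ for $i \ge 2$ and $Q_1(\cdot/x^0)$ the law of $x_1$, both under the law $\mathbb{Q}$. We shall use the Kullback information between conditional distributions,

$H_i(\tilde x^{i-1}) = H(Q_i(\cdot/\tilde x^{i-1}) / P_i(\cdot/\tilde x^{i-1}))$,

and exploit the following important identity:

(2.5)    $\displaystyle H(\mathbb{Q}/\mathbb{P}) = \sum_{i=1}^n \int_{E^n} H_i(\tilde x^{i-1})\, d\mathbb{Q}(\tilde x)$.

The key is to construct an appropriate coupling of $\mathbb{Q}$ and $\mathbb{P}$, that is, two random sequences $\tilde X^n$ and $X^n$ distributed according to $\mathbb{Q}$ and $\mathbb{P}$, respectively, on some probability space $(\Omega, \mathcal{F}, P)$.

We define a joint distribution $\mathcal{L}(\tilde X^n, X^n)$ by induction as follows. Add artificially the time 0 and put $\tilde X_0 = X_0 = \tilde x^0 = x^0$, the fixed point. Assume that for some $i$, $1 \le i \le n$, $\mathcal{L}(\tilde X^{i-1}, X^{i-1})$ is already defined. We have to define the joint conditional distribution $\mathcal{L}(\tilde X_i, X_i / \tilde X^{i-1} = \tilde x^{i-1}, X^{i-1} = x^{i-1})$, where $(\tilde x^{i-1}, x^{i-1})$ is fixed (but arbitrary).

Given $\varepsilon > 0$ so small that $r(1 + \varepsilon) < 1$, this distribution will have marginal laws

$\mathcal{L}(\tilde X_i / \tilde X^{i-1} = \tilde x^{i-1}, X^{i-1} = x^{i-1}) = Q_i(\cdot/\tilde x^{i-1})$

and

$\mathcal{L}(X_i / \tilde X^{i-1} = \tilde x^{i-1}, X^{i-1} = x^{i-1}) = P_i(\cdot/x^{i-1})$

so as to satisfy

$\mathbb{E}(d(\tilde X_i, X_i)^p / \tilde X^{i-1} = \tilde x^{i-1}, X^{i-1} = x^{i-1}) \le (1 + \varepsilon)\, W_p^d(Q_i(\cdot/\tilde x^{i-1}), P_i(\cdot/x^{i-1}))^p$

for all $\tilde x^{i-1}, x^{i-1}$ in $E^{i-1}$. Obviously, $\tilde X^n$, $X^n$ are of law $\mathbb{Q}$, $\mathbb{P}$, respectively. By the triangle inequality for the $W_p^d$-distance,

$\mathbb{E}(d(\tilde X_i, X_i)^p / \tilde X^{i-1} = \tilde x^{i-1}, X^{i-1} = x^{i-1}) \le (1 + \varepsilon) [W_p^d(Q_i(\cdot/\tilde x^{i-1}), P_i(\cdot/\tilde x^{i-1})) + W_p^d(P_i(\cdot/\tilde x^{i-1}), P_i(\cdot/x^{i-1}))]^p$.

Using the elementary inequality $(x + y)^p \le a^{p-1} x^p + b^{p-1} y^p$ (for $p \ge 1$, $\forall x, y \ge 0$), where $a, b > 1$ are such that $1/a + 1/b = 1$, we have by the assumptions $P_i(\cdot/x^{i-1}) \in T_p(C)$ and (C1)

(2.6)    $\displaystyle\mathbb{E}(d^p(\tilde X_i, X_i) / \tilde X^{i-1} = \tilde x^{i-1}, X^{i-1} = x^{i-1}) \le (1 + \varepsilon) \left(\sqrt{2C H_i(\tilde x^{i-1})} + \left(\sum_{j=1}^{i-1} (a_j)^p\, d^p(\tilde x_{i-j}, x_{i-j})\right)^{1/p}\right)^p \le (1 + \varepsilon) \left(a^{p-1} [2C H_i(\tilde x^{i-1})]^{p/2} + b^{p-1} \sum_{j=1}^{i-1} (a_j)^p\, d^p(\tilde x_{i-j}, x_{i-j})\right)$.

By recurrence on $i$, this entails that $\mathbb{E}\, d^p(\tilde X_i, X_i) < +\infty$ for all $i = 1, \ldots, n$. Taking the average with respect to $\mathcal{L}(\tilde X^{i-1}, X^{i-1})$, summing on $i$ and using the concavity of the function $x \to x^{p/2}$ for $p \in [1, 2]$, we get by (2.5) and (2.6)

$\displaystyle\frac{1}{n(1+\varepsilon)} \sum_{i=1}^n \mathbb{E}\, d^p(\tilde X_i, X_i) \le a^{p-1} \left(\frac{2C}{n} \sum_{i=1}^n \mathbb{E}\, H_i(\tilde X^{i-1})\right)^{p/2} + \frac{b^{p-1}}{n} \sum_{i=1}^n \sum_{j=1}^{i-1} (a_j)^p\, \mathbb{E}\, d^p(\tilde X_{i-j}, X_{i-j})$

$\displaystyle\qquad = a^{p-1} \left(\frac{2C}{n} H(\mathbb{Q}/\mathbb{P})\right)^{p/2} + \frac{b^{p-1}}{n} \sum_{k=1}^{n-1} \mathbb{E}\, d^p(\tilde X_k, X_k) \sum_{i=k+1}^n (a_{i-k})^p$

$\displaystyle\qquad \le a^{p-1} \left(\frac{2C}{n} H(\mathbb{Q}/\mathbb{P})\right)^{p/2} + \frac{r^p b^{p-1}}{n} \sum_{k=1}^n \mathbb{E}\, d^p(\tilde X_k, X_k)$.

Since $\varepsilon > 0$ is arbitrary, the above inequality gives us, when $r^p b^{p-1} < 1$,

$\displaystyle W_p^{d_{l_p}}(\mathbb{Q}, \mathbb{P}) \le \left(\frac{a^{p-1}}{1 - r^p b^{p-1}}\right)^{1/p} \sqrt{2C\, n^{2/p-1}\, H(\mathbb{Q}/\mathbb{P})}$.

Optimizing on $(a,b)$, we get the desired inequality. □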

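Note that for a real function $f$ on $E^n$, $\|f\|_{\mathrm{Lip}(d_{l_1})} \le \alpha$ if and only if for every $k = 1, \ldots, n$,

(2.7)    $|f_k(x_k) - f_k(y_k)| \le \alpha\, d(x_k, y_k) \quad \forall x_k, y_k \in E$,

where $f_k(x_k)$ is the function $f$ w.r.t. the $k$th variable while the others are fixed. We then get by Theorem 1.1,

Corollary 2.6. Under the assumption of Theorem 2.5 for $p = 1$, for any real function $f$ on $E^n$ satisfying (2.7),

$\displaystyle\mathbb{E}_{\mathbb{P}}\, e^{\lambda(f - \mathbb{E}_{\mathbb{P}} f)} \le \exp\left(\frac{C\lambda^2\alpha^2 n}{2(1-r)^2}\right) \quad \forall \lambda \in \mathbb{R}$.

In particular, for any $t > 0$,

$\displaystyle\mathbb{P}(f > \mathbb{E}_{\mathbb{P}} f + t) \le \exp\left(-\frac{t^2(1-r)^2}{2nC\alpha^2}\right)$.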
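The deviation bound of Corollary 2.6 is simple to evaluate. The following sketch reads the constant $C_n = nC/(1-r)^2$ off Theorem 2.5 with $p = 1$ and returns the resulting tail estimate; the function names are hypothetical and `alpha` denotes the Lipschitz constant in (2.7).

```python
import math


def tensorized_t1_constant(n, C, r):
    """C_n = n*C/(1-r)^2, the T_1 constant given by Theorem 2.5 with p = 1."""
    if not 0.0 <= r < 1.0:
        raise ValueError("the contraction coefficient r must lie in [0, 1)")
    return n * C / (1.0 - r) ** 2


def deviation_bound(t, n, C, r, alpha=1.0):
    """Upper bound exp(-t^2 (1-r)^2 / (2 n C alpha^2)) on P(f > E f + t)."""
    c_n = tensorized_t1_constant(n, C, r)
    return math.exp(-t * t / (2.0 * c_n * alpha ** 2))


if __name__ == "__main__":
    # Independent case (r = 0) with the trivial metric (C = 1/4):
    # the bound reduces to Hoeffding's exp(-2 t^2 / (n alpha^2)).
    print(deviation_bound(t=5.0, n=10, C=0.25, r=0.0))
    # A dependent case with contraction coefficient r = 0.5.
    print(deviation_bound(t=5.0, n=10, C=0.25, r=0.5))
```

With $r = 0$ and $C = 1/4$ the bound coincides with the classical Hoeffding estimate, while dependence ($r > 0$) inflates the variance proxy by the factor $(1-r)^{-2}$.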



Remark 2.7. The condition $P_i(\cdot/x^{i-1}) \in T_p(C)$ is our starting point for tensorization of $T_p(C)$ and it is verified for many interesting examples, such as the stochastic differential equation (SDE) (4.1) or random dynamical systems or Gibbs fields. Condition (C1), meaning that the dependence of the present on the past is very weak, is a crucial condition. Indeed, when $d(x,y) = 1_{x \ne y}$, $p = 1$ and $\mathbb{P}$ is Markovian, (C1) is equivalent to (1.6), and Theorem 2.5 is exactly the result of Marton mentioned in the Introduction.

Remark 2.8. That the constant $C_n$ for the $T_1$-inequality of $\mathbb{P}$ increases linearly in the dimension $n$ is natural from the point of view of the Hoeffding inequality in Corollary 2.6. This is completely different from the case of the $T_2$-inequality, for which it is hoped that the $T_2$-constant remains independent of the dimension $n$, as seen for the independent tensorization of $T_2(C)$ by Talagrand [18] or its extension Theorem 2.5.

Remark 2.9. Under $P_i(\cdot/x^{i-1}) \in T_p(C)$ and (2.4) but without the contraction condition that $r^p := \sum_j (a_j)^p < 1$, we always have $\mathbb{P} \in T_p(C_n)$ on $E^n$ w.r.t. $d_{l_p}$ for some constant $C_n$ (but the crucial estimate of $C_n$ in Theorem 2.5 is lost). We give the proof of this fact only for $p = 1$.

Indeed, consider the nonnegative nilpotent lower triangular matrix $A = (a_{ij})$, where $a_{ij} = a_{i-j}$ if $i > j$ and 0 otherwise. For any given $\delta \in (0, 1)$, there is always a (positive) vector $z = (z_1, \ldots, z_n)$ such that $z_i > 0$, $\sum_i z_i = 1$ and

$\displaystyle (zA)_k = \sum_{i=k+1}^n z_i a_{i-k} \le \delta z_k \quad \forall k = 1, \ldots, n$.

Then by (2.5) and (2.6) for $p = 1$, we have by Jensen's inequality,

$\displaystyle\sum_{i=1}^n z_i\, \mathbb{E}\, d(\tilde X_i, X_i) \le (1 + \varepsilon) \left(\sum_{i=1}^n z_i\, \mathbb{E} \sqrt{2C H_i(\tilde X^{i-1})} + \sum_{i=1}^n z_i \sum_{j=1}^{i-1} a_j\, \mathbb{E}\, d(\tilde X_{i-j}, X_{i-j})\right)$

$\displaystyle\qquad \le (1 + \varepsilon) \left(\sqrt{2C \sum_{i=1}^n z_i\, \mathbb{E}\, H_i(\tilde X^{i-1})} + \sum_{k=1}^{n-1} \mathbb{E}\, d(\tilde X_k, X_k) \sum_{i=k+1}^n z_i a_{i-k}\right)$

$\displaystyle\qquad \le (1 + \varepsilon) \left(\sqrt{2C \max_i z_i\, H(\mathbb{Q}/\mathbb{P})} + \sum_{k=1}^{n-1} \delta z_k\, \mathbb{E}\, d(\tilde X_k, X_k)\right)$,

whence it follows that

$\displaystyle W_1^{d_{l_1}}(\mathbb{Q}, \mathbb{P}) \le \frac{1}{(1 - \delta) \min_i z_i} \sqrt{2C \max_i z_i\, H(\mathbb{Q}/\mathbb{P})}$.

When $z_i = 1/n$, the best choice of $\delta$ is $r$, and this inequality becomes Theorem 2.5.



2.5. $T_1(C)$ for weakly dependent sequences: McDiarmid-Rio's martingale method revisited. The last inequality in Corollary 2.6, applied to $F(X_1, \ldots, X_n) = \sum_{k=1}^n f(X_k)$ and the trivial metric $d$, where $(X_k)$ are independent and $\|f(X_k)\|_\infty \le \alpha$, becomes exactly the sharp Hoeffding inequality (see [13]). But when it is applied to $F(X_1, \ldots, X_n) = f(X_n)$, it does not furnish the good order in $n$ for $n$ large. As this last question is important for the $T_1(C)$ of the invariant measure, we now give a very simple proof of the following:

Proposition 2.10. Let $(E,d)$ be a Polish space. Let $P(x,dy)$ be a Markov kernel on $E$ such that:

(a) $P(x, \cdot) \in T_1(C)$ for every $x \in E$;
(b) $W_1^d(P(x, \cdot), P(\tilde x, \cdot)) \le r\, d(x, \tilde x)$, for every $x, \tilde x$ in $E$ and some $r < 1$.

Then there is a unique invariant probability measure $\mu$ of $P$ and it satisfies $T_1(C_\infty)$, as does $P^n(x, \cdot)$ $\forall n \ge 1$, where $C_\infty = C(1 - r^2)^{-1}$.

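Proof. When $(E, d)$ is Polish, the space $M_1^p(E)$ of probability measures $\nu$ on $E$ such that $\int d(x, x_0)^p\, d\nu(x) < +\infty$, equipped with the Wasserstein metric $W_p(\cdot,\cdot)$, is a complete separable metric space (see [19]). Since $\nu \in M_1^1(E) \Rightarrow \nu P \in M_1^1(E)$ by (a), and condition (b) implies (in fact, is equivalent to)

$W_1(\nu_1 P, \nu_2 P) \le r\, W_1(\nu_1, \nu_2) \quad \forall \nu_1, \nu_2 \in M_1^1(E)$,

by the fixed point theorem there is one and only one $P$-invariant measure $\mu \in M_1^1(E)$, and $P^n(x, \cdot) \to \mu$ in the metric $W_1$ for any initial point $x \in E$. The last point shows also that $\mu$ is the unique invariant probability measure of $P$ [without the restriction that $\mu \in M_1^1(E)$].

Since

$\displaystyle W_1(\nu_1, \nu_2) = \sup_{f:\ \|f\|_{\mathrm{Lip}} \le 1} \int f\, d(\nu_1 - \nu_2)$,

condition (b) is also equivalent to

$\|Pf\|_{\mathrm{Lip}} \le r \|f\|_{\mathrm{Lip}} \quad \forall f$.

Thus, $\|P^N f\|_{\mathrm{Lip}} \le r^N \|f\|_{\mathrm{Lip}}$ for all $N \ge 1$. Now given a Lipschitzian function $f$, we have by (a) and Bobkov-Götze's Theorem 1.1,

$\displaystyle P^n(e^f) \le P^{n-1}\left(\exp\left(Pf + \frac{C\|f\|_{\mathrm{Lip}}^2}{2}\right)\right)$

$\displaystyle\qquad \le P^{n-2}\left(\exp\left(P^2 f + \frac{C\|f\|_{\mathrm{Lip}}^2}{2} + \frac{C\|Pf\|_{\mathrm{Lip}}^2}{2}\right)\right)$

$\displaystyle\qquad \le \cdots \le \exp\left(P^n f + \frac{C\|f\|_{\mathrm{Lip}}^2}{2} + \frac{C\|Pf\|_{\mathrm{Lip}}^2}{2} + \cdots + \frac{C\|P^{n-1}f\|_{\mathrm{Lip}}^2}{2}\right)$

$\displaystyle\qquad \le \exp\left(P^n f + \frac{C\|f\|_{\mathrm{Lip}}^2}{2(1 - r^2)}\right)$.

In other words, for every $x \in E$, $P^n(x, \cdot) \in T_1(C_\infty)$, where $C_\infty$ is given in the proposition. Letting $n \to \infty$, we obtain the desired result for $\mu$ by Lemma 2.2. □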
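A Gaussian autoregression gives a case where the constant $C_\infty$ can be compared with an exact value. The sketch below assumes the chain $X_{k+1} = rX_k + W_{k+1}$ on $\mathbb{R}$ with i.i.d. standard Gaussian noise and $0 \le r < 1$ (an illustration chosen here, not an example from the paper). Then $P(x,\cdot) = \mathcal{N}(rx, 1)$ satisfies (a) with $C = 1$ and (b) with the same $r$, and $P^n(x,\cdot)$ is Gaussian with variance $v_n$. Since a Gaussian law of variance $v$ on $\mathbb{R}$ satisfies $T_1(v)$ and no smaller constant (take $F(x) = x$ in (1.4)), the proposition predicts $v_n \le C_\infty$.

```python
def ar1_variances(r, n_steps):
    """Variances v_1, ..., v_n of P^n(x, .) for X_{k+1} = r X_k + W_{k+1}, W ~ N(0,1)."""
    variances = []
    v = 0.0  # the chain starts from a deterministic point x
    for _ in range(n_steps):
        v = r * r * v + 1.0  # one step of the variance recursion
        variances.append(v)
    return variances


def c_infinity(C, r):
    """The constant C_infinity = C / (1 - r^2) of Proposition 2.10."""
    return C / (1.0 - r * r)


if __name__ == "__main__":
    r = 0.5
    vs = ar1_variances(r, 60)
    print(vs[0], vs[-1], c_infinity(1.0, r))
```

The variances increase to $1/(1-r^2)$, so in this example the constant $C_\infty$ cannot be improved.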



We now use the martingale method of McDiarmid [14] (in the independent 
case) and Rio [16] (in the uniform mixing case) for extending the argument 
above to the process-level law P. 
Theorem 2.11. Let $\mathbb{P}$ be a probability measure on $E^n$ satisfying $P_i(\cdot/x^{i-1}) \in T_1(C)$ for all $(i, x^{i-1})$ as in Theorem 2.5. Assume instead of (C1) that

(C1') there is some constant $S \ge 0$ such that for all real bounded Lipschitzian functions $f(x_{k+1}, \ldots, x_n)$ with $\|f\|_{\mathrm{Lip}(d_{l_1})} \le 1$, for all $x \in E^n$, $y_k \in E$,

$|\mathbb{E}_{\mathbb{P}}(f(X_{k+1}, \ldots, X_n)/X^k = x^k) - \mathbb{E}_{\mathbb{P}}(f(X_{k+1}, \ldots, X_n)/X^k = (x^{k-1}, y_k))| \le S\, d(x_k, y_k)$.

Then for all functions $F$ on $E^n$ satisfying (2.7),

(2.8)    $\displaystyle\mathbb{E}_{\mathbb{P}}\, e^{\lambda(F - \mathbb{E}_{\mathbb{P}} F)} \le \exp\left(\frac{C\lambda^2 (1 + S)^2 \alpha^2 n}{2}\right) \quad \forall \lambda \in \mathbb{R}$.

Equivalently, $\mathbb{P} \in T_1(C_n)$ on $(E^n, d_{l_1})$ with

$C_n = nC(1 + S)^2$.

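Proof. We may assume without loss of generality that $\alpha = 1$. Let $(M_k = \mathbb{E}_{\mathbb{P}}(F/X^k))_{k \ge 0}$, where $M_0 = \mathbb{E}_{\mathbb{P}} F$. It is a martingale. It is enough to show that for each $k$,

$\displaystyle\mathbb{E}_{\mathbb{P}}(e^{\lambda(M_k - M_{k-1})} / X^{k-1}) \le \exp\left(\frac{C\lambda^2 (1 + S)^2}{2}\right)$.

To this end, note first that by $P_k(\cdot/x^{k-1}) \in T_1(C)$ and Theorem 1.1,

$\displaystyle\mathbb{E}_{\mathbb{P}}(e^{\lambda(M_k - M_{k-1})} / X^{k-1}) \le \exp\left(\frac{C\lambda^2 b_k^2}{2}\right)$,

where

$\displaystyle b_k := \sup_{x_k \ne y_k} \frac{|M_k(x^k) - M_k(x^{k-1}, y_k)|}{d(x_k, y_k)}$.

But $M_k(x^k) = \int F(x^k, x_{k+1}, \ldots, x_n)\, \mathbb{P}(dx_{k+1}, \ldots, dx_n / x^k)$; writing $x_{k+1}^n = (x_{k+1}, \ldots, x_n)$, we have

$\displaystyle |M_k(x^k) - M_k(x^{k-1}, y_k)| \le \left|\int (F(x^k, x_{k+1}^n) - F(x^{k-1}, y_k, x_{k+1}^n))\, \mathbb{P}(dx_{k+1}^n / x^k)\right| + \left|\int F(x^{k-1}, y_k, x_{k+1}^n)\, (\mathbb{P}(dx_{k+1}^n / x^k) - \mathbb{P}(dx_{k+1}^n / (x^{k-1}, y_k)))\right| \le d(x_k, y_k) + S\, d(x_k, y_k)$.

Hence, $b_k \le (1 + S)$, the desired result. □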
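It is instructive to compare the constants of Theorems 2.5 and 2.11 in a case where both apply. The sketch below assumes a random dynamical system whose one-step map contracts in mean with rate $r < 1$, so that $\mathbb{E}\, d(X_n(x), X_n(\tilde x)) \le r^n d(x, \tilde x)$ by iteration and one may take $S = \sum_{n \ge 1} r^n = r/(1-r)$ in (C1'). This choice of $S$ is an assumption of the illustration, and the function names are hypothetical.

```python
def t1_constant_coupling(n, C, r):
    """C_n = n*C/(1-r)^2 from Theorem 2.5 with p = 1."""
    return n * C / (1.0 - r) ** 2


def t1_constant_martingale(n, C, S):
    """C_n = n*C*(1+S)^2 from Theorem 2.11."""
    return n * C * (1.0 + S) ** 2


def s_from_contraction(r):
    """S = r/(1-r), the sum of the geometric series r + r^2 + ... for 0 <= r < 1."""
    if not 0.0 <= r < 1.0:
        raise ValueError("r must lie in [0, 1)")
    return r / (1.0 - r)


if __name__ == "__main__":
    n, C, r = 10, 1.0, 0.3
    print(t1_constant_coupling(n, C, r))
    print(t1_constant_martingale(n, C, s_from_contraction(r)))
```

Because $1 + r/(1-r) = 1/(1-r)$, the two constants agree under this assumption; the martingale method gains only when $S$ is finite although no one-step contraction holds.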
Remark 2.12. When $d(x,y) = 1_{x \ne y}$, $P_i(\cdot/x^{i-1}) \in T_1(1/4)$, and this result is essentially due to Rio [16]. Using a different condition than (C1'), he essentially proved that the constant $S$ in condition (C1') verifies $S \le 2\sum_{j=1}^\infty \phi_j$, where $\phi_j$ is the uniform mixing coefficient of the sequence $(X_n)$. Our proof above is, in fact, inspired by his work.

Remark 2.13. If the condition (C1) is viewed as a backward type, then (C1') may be seen as a forward type. Indeed (C1') is equivalent to

$W_1^{d_{l_1}}(\mathbb{P}(dx_{k+1}^n / x_k, x^{k-1}), \mathbb{P}(dx_{k+1}^n / y_k, x^{k-1})) \le S\, d(x_k, y_k)$.

It means intuitively that the present does not influence the future of the process $\mathbb{P}$ much. In concrete situations (C1') is often weaker than (C1) with $p = 1$. For example, let $(\mathbb{P}_x)$ be a uniformly ergodic (Doeblin recurrent, say) Markov chain with transition $P(x,dy)$ in the sense that $r_n := \sup_{x \in E} \|P^n(x, \cdot) - \mu\|_{TV} \to 0$. As $2\phi_n \le r_n$, we have by Rio's estimate above,

$\displaystyle S \le \sum_{n=1}^\infty \sup_{x \in E} \|P^n(x, \cdot) - \mu\|_{TV}$,

which is finite. But Marton's condition (1.6) or (C1) means $(1/2) \sup_{x \in E} \|P^n(x, \cdot) - \mu\|_{TV} \le r^n$ for all $n \ge 1$. See also Example 3.3.

It would be very interesting to generalize Theorem 2.11 to $T_2(C)$.

3. Application: study of $T_1(C)$ and $T_2(C)$ for random dynamical systems.

3.1. $T_1(C)$. Let $E$ be a complete connected Riemannian manifold equipped with the Riemannian metric $d$. Consider now the nonlinear random perturbed dynamical system valued in $E$,

(3.1)    $X_0(x) := x \in E, \quad X_{n+1}(x) = F(X_n(x), W_{n+1}), \quad n \ge 0$,

where the noise $(W_n)_{n \ge 0}$ is a sequence of i.i.d. r.v. valued in some measurable space $(G, \mathcal{G})$, defined on some probability space $(\Omega, \mathcal{F}, \mathbb{P})$, and $F(x,w) : E \times G \to E$ is measurable. Denote by $P(x,dy)$ the law of $F(x,W_1)$. We have the following:

Proposition 3.1. Assume that there exists $\delta > 0$ such that

(3.2)    $\displaystyle\sup_{x \in E} \mathbb{E}\left(e^{\delta\, d(F(x,W_1),\, F(x,W_2))^2}\right) < +\infty$.

If there exists $0 \le r < 1$ such that

(3.3)    $\mathbb{E}(d(F(x,W_1), F(\tilde x,W_1))) \le r\, d(x, \tilde x) \quad \forall x, \tilde x \in E$,

or more generally for some constant $S \ge 0$,

(3.4)    $\displaystyle\sum_{n=1}^\infty \mathbb{E}(d(X_n(x), X_n(\tilde x))) \le S\, d(x, \tilde x) \quad \forall x, \tilde x \in E$,

then there is some constant $C > 0$ such that for any $n \ge 1$, for every probability measure $\mathbb{Q}_n$ on $E^n$,

$\displaystyle W_1^{d_{l_1}}(\mathbb{Q}_n, \mathbb{P}_x^n) \le \sqrt{C\, n\, H(\mathbb{Q}_n/\mathbb{P}_x^n)}$,

where $\mathbb{P}_x^n$ is the law of $(X_k(x))_{1 \le k \le n}$ on $E^n$.

Proof. By Theorem 2.3, condition (3.2) is equivalent to "$P(x, \cdot) \in T_1(C)$ $\forall x \in E$." Notice that (3.3) is equivalent to (C1) (with $p = 1$) in Theorem 2.5, and (3.4) implies trivially (C1') with the same constant $S$ in Theorem 2.11. Hence, this proposition follows from Theorems 2.5 and 2.11. □

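Remark 3.2. If the largest Lyapunov exponent in $L^1$ given by

$\displaystyle\lambda_{\max}(L^1) := \lim_{n \to \infty} \left(\sup_{x \ne \tilde x} \frac{\mathbb{E}\, d(X_n(x), X_n(\tilde x))}{d(x, \tilde x)}\right)^{1/n}$

is strictly smaller than 1, then condition (3.4) is verified.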

Example 3.3 (ARMA model). To see the difference between (C1) in Theorem 2.5 and (C1') in Theorem 2.11, let us consider the ARMA model

$X_0(x) = x, \quad X_{n+1}(x) = AX_n(x) + W_{n+1}$

in $E = \mathbb{R}^d$, where $A \in \mathcal{M}_{d \times d}$ (the space of $d \times d$ matrices) and $(W_n)$ is a sequence of i.i.d. r.v. with values in $G = \mathbb{R}^d$. This model is a particular case of the general model above with $F(x,w) = Ax + w$. Condition (C1), equivalent to (3.3), means that $r = \|A\| := \sup\{|Ax|;\ |x| \le 1\} < 1$; however, (C1') for this linear model is equivalent to

$r_{sp}(A) := \max\{|\lambda|;\ \lambda \text{ is an eigenvalue in } \mathbb{C} \text{ of } A\} = \lambda_{\max}(L^1) < 1$,

which is much weaker. This last condition is a well-known sharp sufficient condition for the ergodicity of this linear ARMA model $(X_n)$.
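The gap between the two conditions is already visible for a $2 \times 2$ matrix. The sketch below uses a hypothetical matrix $A$ with operator norm larger than 1 but spectral radius $1/2$; for the linear model $X_n(x) - X_n(\tilde x) = A^n(x - \tilde x)$, so (3.4) holds with $S = \sum_{n \ge 1} \|A^n\|$, which the code approximates by a partial sum. Everything is written in plain Python for $2 \times 2$ matrices only.

```python
import cmath
import math


def spectral_radius_2x2(m):
    """Largest modulus of the eigenvalues of a real 2x2 matrix."""
    (a, b), (c, d) = m
    tr, det = a + d, a * d - b * c
    root = cmath.sqrt(tr * tr - 4.0 * det)
    return max(abs((tr + root) / 2.0), abs((tr - root) / 2.0))


def operator_norm_2x2(m):
    """Euclidean operator norm: square root of the largest eigenvalue of m^T m."""
    (a, b), (c, d) = m
    t = a * a + b * b + c * c + d * d   # trace of m^T m
    det = (a * d - b * c) ** 2          # determinant of m^T m
    disc = max(t * t - 4.0 * det, 0.0)  # clamp rounding errors
    return math.sqrt((t + math.sqrt(disc)) / 2.0)


def matmul_2x2(m1, m2):
    """Product of two 2x2 matrices given as nested tuples."""
    (a, b), (c, d) = m1
    (e, f), (g, h) = m2
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))


def sum_of_power_norms(m, n_terms=200):
    """Partial sum of ||A^n|| for n = 1..n_terms, a candidate for S in (3.4)."""
    total, power = 0.0, m
    for _ in range(n_terms):
        total += operator_norm_2x2(power)
        power = matmul_2x2(power, m)
    return total


if __name__ == "__main__":
    A = ((0.5, 2.0), (0.0, 0.5))
    print(operator_norm_2x2(A))    # larger than 1: (C1) fails
    print(spectral_radius_2x2(A))  # smaller than 1: (C1') can still hold
    print(sum_of_power_norms(A))   # finite candidate for S
```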



Remark 3.4. For this model, the known results mentioned in the Introduction cannot be applied, for the uniform mixing condition is, in general, not satisfied when $E$ is noncompact. For example, the ARMA model with $A \ne 0$ and $W_1$ unbounded is never uniformly mixing. See [22].



16 H. DJELLOUT, A. GUILLIN AND L. WU 

3.2. $T_2(C)$. Consider a particular case of the preceding model

(3.5)    $X_0(x) = x, \quad X_{n+1}(x) = f(X_n(x)) + \sigma(X_n(x)) W_{n+1}$,

(the discrete time SDE), that is, $F(x,w) = f(x) + \sigma(x)w$, where $E = \mathbb{R}^d$, $G = \mathbb{R}^n$, $f : \mathbb{R}^d \to \mathbb{R}^d$, $\sigma : \mathbb{R}^d \to \mathcal{M}_{d \times n}$ (the space of $d \times n$ matrices) and the noise $(W_n)_{n \ge 0}$ is a sequence of i.i.d. r.v. with values in $\mathbb{R}^n$ such that $\mathbb{E}W_1 = 0$. Assume that:

(i) $\mathbb{P}_W := \mathbb{P}(W_1 \in \cdot) \in T_2(C)$ on $\mathbb{R}^n$ w.r.t. the Euclidean metric;
(ii) $|\sigma(x)w| \le K|w|$ $\forall (x,w) \in \mathbb{R}^d \times \mathbb{R}^n$;
(iii) for some $r \in [0, 1)$,

(3.6)    $\sqrt{|f(x) - f(\tilde x)|^2 + \mathbb{E}|(\sigma(x) - \sigma(\tilde x))W_1|^2} \le r|x - \tilde x| \quad \forall x, \tilde x \in \mathbb{R}^d$.

Notice that conditions (i) and (ii) imply that $P(x, \cdot) \in T_2(CK^2)$ for all $x \in \mathbb{R}^d$ by Lemma 2.1; and condition (iii) implies (C1) with the same $r$ for $p = 2$. Hence, by Theorem 2.5, $\mathbb{P}_x^n \in T_2(CK^2/(1-r)^2)$. That yields, by Bobkov, Gentil and Ledoux [1], the following:

Corollary 3.5. For the model (3.5) above assume conditions (i)-(iii). Then $\mathbb{P}_x^n \in T_2(CK^2/(1-r)^2)$ and for any measurable function $F(x_1, \ldots, x_n) \in L^1((\mathbb{R}^d)^n, \mathbb{P}_x^n)$,

$\mathbb{E} \exp(\rho\, QF(X_1(x), \ldots, X_n(x))) \le \exp(\rho\, \mathbb{E}F(X_1(x), \ldots, X_n(x)))$,

where

$\displaystyle\rho := \frac{(1-r)^2}{CK^2}, \qquad QF(x_1, \ldots, x_n) := \inf_{y \in (\mathbb{R}^d)^n} \left(F(x + y) + \frac12 \sum_{k=1}^n |y_k|^2\right)$.

As noted in [1], several estimates of Laplace integrals are consequences of the functional inequality version of the $T_2(C)$ above. For instance, Corollary 6.1 in [1] says that for any convex function $F$ on $(\mathbb{R}^d)^n$,

$\displaystyle\mathbb{E}_{\mathbb{P}_x^n} \exp\left(\rho \left(F - \frac12 \sum_{k=1}^n |\nabla_k F|^2\right)\right) \le \exp(\rho\, \mathbb{E}_{\mathbb{P}_x^n} F)$.



Remark 3.6. Consider the Lyapunov exponent in $L^2$,

$\displaystyle\lambda_{\max}(L^2) := \lim_{n \to \infty} \left(\sup_{x \ne \tilde x} \frac{\mathbb{E}\, d(X_n(x), X_n(\tilde x))^2}{d(x, \tilde x)^2}\right)^{1/(2n)}$.

Obviously, (3.6) implies $\lambda_{\max}(L^2) < 1$. It is then natural to ask whether $P(x, \cdot) \in T_2(C)$ $\forall x$ plus $\lambda_{\max}(L^2) < 1$ imply "$\mathbb{P}_x^n \in T_2(K)$" for some constant $K$ independent of $n$ (for which we have no answer, unlike for $T_1$). Notice that for the ARMA model, $\lambda_{\max}(L^2) = \lambda_{\max}(L^1) = r_{sp}(A)$.
4. Application: study of $T_1(C)$ for paths of SDEs. Let us give here an application of Theorem 2.3 to SDEs. Consider the SDE in $\mathbb{R}^d$,

(4.1)    $dX_t = \sigma(X_t)\, dB_t + b(X_t)\, dt, \quad X_0 = x \in \mathbb{R}^d$,

where $\sigma : \mathbb{R}^d \to \mathcal{M}_{d \times n}$, $b : \mathbb{R}^d \to \mathbb{R}^d$ and $(B_t)$ is the standard Brownian motion valued in $\mathbb{R}^n$ defined on some well filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t), \mathbb{P})$. Assume that $\sigma$, $b$ are locally Lipschitzian and for all $x, y \in \mathbb{R}^d$,

(4.2)    $\displaystyle\sup_x \|\sigma(x)\|_{HS} \le A, \qquad \langle y - x, b(y) - b(x)\rangle \le B(1 + |y - x|^2)$,

where $\|\sigma\|_{HS} := \sqrt{\mathrm{tr}\, \sigma\sigma^*}$ is the Hilbert-Schmidt norm, $\langle x, y\rangle$ is the Euclidean inner product and $|x| := \sqrt{\langle x, x\rangle}$. It has a unique nonexplosive solution denoted by $(X_t(x))$ whose law on the space $C(\mathbb{R}^+, \mathbb{R}^d)$ of $\mathbb{R}^d$-valued continuous functions on $\mathbb{R}^+$ will be denoted by $\mathbb{P}_x$.

Corollary 4.1. Assume the conditions above. For each $T > 0$, there exists some constant $C = C(T, A, B)$ independent of the initial point $x$ such that $\mathbb{P}_x$ satisfies $T_1(C)$ for every $x \in \mathbb{R}^d$, on the space $C([0,T], \mathbb{R}^d)$ of $\mathbb{R}^d$-valued continuous functions on $[0,T]$ equipped with the metric

$\displaystyle d_T(\gamma_1, \gamma_2) := \sup_{t \in [0,T]} |\gamma_1(t) - \gamma_2(t)|$.

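Proof. Let $(B_t)$, $(\tilde B_t)$ be two independent Brownian motions defined on some filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t), \mathbb{P})$ and $X_t(x)$, $\tilde X_t(x)$ strong solutions of (4.1), respectively, driven by $(B_t)$, $(\tilde B_t)$. Put

$\hat X_t := X_t(x) - \tilde X_t(x), \qquad \hat b_t := b(X_t(x)) - b(\tilde X_t(x))$,

$a(\cdot) := \sigma\sigma^*(\cdot), \qquad \hat a_t := a(X_t(x)) + a(\tilde X_t(x))$,

$\displaystyle L_t := \int_0^t \sigma(X_s(x))\, dB_s - \int_0^t \sigma(\tilde X_s(x))\, d\tilde B_s$.

Then

$\displaystyle\hat X_t = L_t + \int_0^t \hat b_s\, ds$.

By Theorem 2.3, it is enough to show that there exists some positive constant $\delta = \delta(T, A, B)$ such that

(4.3)    $\displaystyle\mathbb{E} \exp\left(\delta \sup_{t \le T} |\hat X_t|^2\right) < +\infty$.

Let $f(x) := h(|x|)$, where $h \in C^\infty(\mathbb{R})$ is even and such that $h(r) = r$ for $r \ge 4$ and

$h(r) \ge r, \quad 0 \le h'(r) \le 1 \wedge r, \quad 0 \le h''(r) \le 1 \quad \forall r \in [0, 4]$.

Consider $Y_t := (1 + f(\hat X_t)) e^{-\beta t}$, where $\beta > 0$ is a constant to be determined later. By Itô's formula,

$\displaystyle dY_t = e^{-\beta t} \left(\frac12 \sum_{i,j} \hat a_t^{ij}\, \partial_i \partial_j f(\hat X_t) + \langle \nabla f(\hat X_t), \hat b_t\rangle\right) dt - \beta Y_t\, dt + dM_t$

$\displaystyle\quad = e^{-\beta t} \left(\frac12 h''(|\hat X_t|) \frac{\langle \hat X_t, \hat a_t \hat X_t\rangle}{|\hat X_t|^2} + \frac12 \frac{h'(|\hat X_t|)}{|\hat X_t|} \left(\mathrm{tr}\, \hat a_t - \frac{\langle \hat X_t, \hat a_t \hat X_t\rangle}{|\hat X_t|^2}\right) + \frac{h'(|\hat X_t|)}{|\hat X_t|} \langle \hat X_t, \hat b_t\rangle - \beta(1 + h(|\hat X_t|))\right) dt + dM_t$,

where $(M_t)$ is a local martingale with $M_0 = 0$, whose quadratic variational process $[M]$ is given by

$\displaystyle [M]_t = \int_0^t e^{-2\beta s} \langle \nabla f(\hat X_s), \hat a_s \nabla f(\hat X_s)\rangle\, ds \le 2A^2 \int_0^t e^{-2\beta s}\, ds \le \frac{A^2}{\beta}$.

Using our condition (4.2), we see that $Y_t \le 1 + h(0) + M_t$ as soon as

$\beta > \max\{0, 2A^2 + B\}$.

Fix such a $\beta$. For any $\lambda > 0$, using the exponential martingale

$\displaystyle\exp\left(\lambda M_t - \frac{\lambda^2}{2} [M]_t\right)$

(Novikov's condition is satisfied) and Doob's maximal inequality [applied to the positive submartingale $\exp(\lambda M_t/2)$], we have

$\displaystyle\mathbb{E}\, e^{\lambda(\sup_{t \le T} Y_t - 1 - h(0))} \le \mathbb{E} \sup_{t \le T} e^{\lambda M_t} \le 4\, \mathbb{E}(e^{\lambda M_T/2})^2 \le 4 \exp\left(\frac{\lambda^2 A^2}{2\beta}\right)$.

Hence, by Chebyshev's inequality and an optimization over $\lambda$, we get

$\displaystyle\mathbb{P}\left(\sup_{t \le T} Y_t > 1 + h(0) + r\right) \le 4 \exp\left(-\frac{\beta r^2}{2A^2}\right) \quad \forall r > 0$.

Consequently,

$\displaystyle\mathbb{E} \exp\left(a \sup_{t \le T} Y_t^2\right) < +\infty, \quad \text{if } 0 < a < \frac{\beta}{2A^2}$.

Since $|\hat X_t| \le h(|\hat X_t|) \le e^{\beta T} Y_t$ for $t \le T$, (4.3) is true for all $\delta \in (0, e^{-2\beta T} \frac{\beta}{2A^2})$, where $\beta > \max\{0, 2A^2 + B\}$. □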

Remark 4.2. If $b\in C^2$ verifies for some constant $B$,

(4.4) $\nabla^sb := \big(\tfrac12(\partial_ib^j+\partial_jb^i)\big)_{1\le i,j\le d}\le BI_d$

in the order of nonnegative definiteness where $I_d$ is the identity matrix, then
$\langle y-x,b(y)-b(x)\rangle\le B|x-y|^2$ and the condition on $b$ in (4.2) is satisfied.

Remark 4.3. Assume $\|\nabla b\|\le K$, $n=d$ and $\sigma(x)=\sigma=I_d$. The result of Capitaine,
Hsu and Ledoux [3] yields the log-Sobolev inequality below:

[...]

where $DF$ is the Malliavin gradient and

$H := \{h:[0,T]\to\mathbb R^d\ \text{absolutely continuous};\ h(0)=0,\ \|h\|_H^2 := \int_0^T|h'(t)|^2\,dt<+\infty\}$

(the Cameron-Martin space). As the result of Otto and Villani [15] suggests
that the log-Sobolev inequality implies the $T_2(C)$ inequality (that is
proved on smooth Riemannian manifolds), we should have $T_2(C)$ on
$C([0,T])$ w.r.t. the following pseudo-metric,

$d_H(\gamma_1,\gamma_2) := \|\gamma_1-\gamma_2\|_H$ if $\gamma_1-\gamma_2\in H$, and $+\infty$ otherwise.

This last pseudo-metric is much larger than $d_T$ used in the Corollary above.
We shall give a simple proof of this last $T_2(C)$ inequality in Section 5.

Notice that as $d_H$ above is only a pseudo-metric and $\|X_\cdot\|_H=+\infty$, a.s.,
Theorem 1.1 cannot be applied for $T_1(C)$ associated with $d_H$ (since its
sufficient part is no longer valid) and Theorem 2.3 (whose proof is based on
Theorem 1.1) is no longer true w.r.t. $d_H$.

Remark 4.4. Without essential change of proof, the same result holds
if the locally Lipschitzian condition on $\sigma$, $b$ is replaced by the well-posedness
of the martingale problem associated with $(\sigma\sigma^t,b)$, in the sense of Stroock-
Varadhan.

Remark 4.5. If the condition on the drift $b$ in (4.2) is substituted by
$\langle x,b(x)\rangle\le B(1+|x|^2)$, $\forall x\in\mathbb R^d$, then with the same proof as above, we can
prove that $\mathbb E\exp(\delta\sup_{t\in[0,T]}|X_t(x)|^2)<+\infty$ for some $\delta>0$ depending on the
initial point. Hence, $\mathbb P_x$ satisfies the $T_1$-inequality with a constant $C=C_x$
depending on $x$.

Note the following drawback of the previous corollary: the constant $C$ in
the $T_1$-inequality obtained through Theorem 2.3 via inequality (2.2) is of
order $e^{\beta T}$, which is not natural in regard of the results obtained via weakly
dependent sequences. We now show how Theorem 2.5 enables us to get the
correct order.

We know from Corollary 4.1 that the law of $(X_t(x))_{t\in[0,1]}$ satisfies the $T_1$-
inequality with a constant $C$ independent of $x$. In other words, the transition
kernel of the Markov chain $Y_n := X_{[n,n+1]}$ valued in $C([0,1],\mathbb R^d)$ satisfies
$T_1(C)$. Let us check (C1') below.

Given two different initial points $x,\tilde x$, let

$\hat X_t := X_t(x)-X_t(\tilde x)$,
$\hat\sigma_t := \sigma(X_t(x))-\sigma(X_t(\tilde x))$, $\hat b_t := b(X_t(x))-b(X_t(\tilde x))$.

By Itô's formula,

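$|\hat X_t|^2 = |x-\tilde x|^2+\int_0^t\big(\mathrm{tr}(\hat\sigma_s\hat\sigma_s^t)+2\langle\hat X_s,\hat b_s\rangle\big)\,ds+M_t,$

where $(M_t)$ is a local martingale with $M_0=0$, whose quadratic variational
process is given by

$[M]_t = 4\int_0^t\langle\hat X_s,(\hat\sigma_s\hat\sigma_s^t)\hat X_s\rangle\,ds.$

Let $\tau_n := \inf\{t\ge0;\ |\hat X_t|\vee[M]_t=n\}$. If there is $\delta>0$ such that

(4.5) $\tfrac12\mathrm{tr}[(\sigma(x)-\sigma(\tilde x))(\sigma(x)-\sigma(\tilde x))^t]+\langle x-\tilde x,b(x)-b(\tilde x)\rangle\le-\delta|x-\tilde x|^2$, $\forall x,\tilde x\in\mathbb R^d$,

then

$\mathbb E|\hat X_{t\wedge\tau_n}|^2\le|x-\tilde x|^2-2\delta\int_0^t\mathbb E|\hat X_{s\wedge\tau_n}|^2\,ds.$

This entails by Gronwall's inequality and Fatou's lemma,

(4.6) $\mathbb E|X_t(x)-X_t(\tilde x)|^2 = \mathbb E|\hat X_t|^2\le|x-\tilde x|^2e^{-2\delta t}$, $\forall t\ge0$.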
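The contraction (4.6) can be observed numerically in a simple case. The sketch below is only an illustration under stated assumptions: we take $d=1$, $\sigma\equiv1$ and the hypothetical drift $b(x)=-x/2-\tanh(x)$, for which (4.5) holds with $\delta=1/2$, and we run an explicit Euler scheme from two starting points driven by the same Brownian increments. The function name, step size and seed are our own choices. Since the noise is additive, the gap between the two discretized paths is deterministic and should stay below $|x-\tilde x|^2e^{-2\delta t}$.

```python
import numpy as np

def coupled_gap(x, x_tilde, T=5.0, n_steps=5000, seed=0):
    """Euler scheme for dX = dB + b(X) dt from two starting points,
    driven by the same Brownian increments (synchronous coupling)."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    # Hypothetical drift: <x - y, b(x) - b(y)> <= -|x - y|^2 / 2, so delta = 1/2.
    b = lambda z: -0.5 * z - np.tanh(z)
    X, Y = float(x), float(x_tilde)
    times = np.linspace(0.0, T, n_steps + 1)
    gap2 = np.empty(n_steps + 1)
    gap2[0] = (X - Y) ** 2
    for k in range(n_steps):
        dB = np.sqrt(dt) * rng.standard_normal()
        X, Y = X + b(X) * dt + dB, Y + b(Y) * dt + dB
        gap2[k + 1] = (X - Y) ** 2
    return times, gap2

delta = 0.5
times, gap2 = coupled_gap(2.0, -1.0)
bound = (2.0 - (-1.0)) ** 2 * np.exp(-2.0 * delta * times)
```

Comparing `gap2` with `bound` entry by entry gives a pathwise version of (4.6) for this particular drift; for a non-constant $\sigma$ only the bound in expectation can be expected.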

Moreover, if $\sigma$ is globally Lipschitzian, then by Burkholder-Davis-Gundy's
inequality and Gronwall's inequality, we obtain easily from the estimate
above that

$\mathbb E\sup_{t\le s\le t+1}|X_s(x)-X_s(\tilde x)|^2\le K|x-\tilde x|^2e^{-2\delta t}$

for some constant $K$. Thus, the Markov chain $Y_n := X_{[n,n+1]}$ valued in
$C([0,1],\mathbb R^d)$ satisfies (C1') too. Consequently, we obtain by Theorem 2.11,
the following:

Proposition 4.6. Assume (4.2), (4.5) and that $\sigma$ is globally Lipschitzian.
Then there is some constant $C>0$ such that for any $n\ge1$ and any initial
point $x$, the law $\mathbb P_x$ of $(X_t(x))_{t\in[0,n]}$ on $C([0,n],\mathbb R^d)$ satisfies the inequality
$T_1(C\cdot n)$ w.r.t. the metric

$d(\gamma_1,\gamma_2) := \sum_{k=0}^{n-1}\sup_{k\le t\le k+1}|\gamma_1(t)-\gamma_2(t)|.$
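To fix ideas on the metric of Proposition 4.6, the following sketch compares it with the uniform metric $d_T$ of Corollary 4.1 (with $T=n$) on two sampled scalar paths. The grid, the two test paths and the function names are illustrative assumptions; on a grid the suprema are only approximated by maxima. One expects the block-sum metric to lie between $d_T$ and $n\,d_T$.

```python
import numpy as np

def sup_metric(g1, g2):
    """Uniform distance between two sampled scalar paths."""
    return float(np.max(np.abs(g1 - g2)))

def block_sum_metric(g1, g2, n, pts_per_unit):
    """Sum over k = 0..n-1 of the max of |g1 - g2| over the grid points of [k, k+1]."""
    diff = np.abs(g1 - g2)
    total = 0.0
    for k in range(n):
        # Slice includes both endpoints k and k+1 of the block.
        block = diff[k * pts_per_unit:(k + 1) * pts_per_unit + 1]
        total += float(block.max())
    return total

n, m = 4, 100                      # time horizon [0, n], m grid steps per unit of time
t = np.linspace(0.0, n, n * m + 1)
g1, g2 = np.sin(t), np.cos(2.0 * t)  # two arbitrary test paths
d_sup = sup_metric(g1, g2)
d_sum = block_sum_metric(g1, g2, n, m)
```

The factor $n$ between the two metrics is consistent with the constant $C\cdot n$ appearing in the proposition.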






Remark 4.7. Let $(P_t)$ be the semigroup of transition probability kernels
of our diffusion $(X_t)$. Notice that under (4.5), we have (4.6) which entails
not only the existence and uniqueness of the invariant probability measure
$\mu$ of $(P_t)$, but also

$W_2^d(P_t(x,\cdot),P_t(\tilde x,\cdot))\le e^{-\delta t}|x-\tilde x|,$

which gives us the exponential convergence below:

$W_2^d(P_t(x,\cdot),\mu)\le e^{-\delta t}\Big(\int|x-\tilde x|^2\,d\mu(\tilde x)\Big)^{1/2}$, $\forall x\in\mathbb R^d$, $t>0$.

Let us present a Hoeffding type inequality for

$F(\gamma) := \int_0^nV(\gamma(t))\,dt,$

where $V$ satisfies $\|V\|_{Lip}\le\alpha$. For such $V$, $\|F\|_{Lip}\le\alpha$ w.r.t. the
metric given in the proposition above. Hence, by Theorem 1.1, Proposition
4.6 entails

$\mathbb P\Big(\int_0^n[V(X_t(x))-\mathbb EV(X_t(x))]\,dt>r\Big)\le\exp\Big(-\frac{r^2}{2nC\alpha^2}\Big)$, $\forall r>0$.

5. A direct approach to $T_2(C)$ for SDEs via stochastic calculus.

5.1. $T_2$-inequality of the Wiener measure w.r.t. the Cameron-Martin metric.
Let us extend the $T_2$-inequality of the Gaussian measure due to Talagrand
to the Wiener measure $\mathbb P$ on $C([0,T],\mathbb R^d)$, by means of the Girsanov
formula. Given $Q\ll\mathbb P$ such that $H(Q/\mathbb P)<+\infty$, then under $Q$, there exist
a Brownian motion $(B_t)$ and a predictable process $(\beta_t)$ such that the
coordinates system $(\gamma_t)$ of $C([0,T],\mathbb R^d)$ verifies

$d\gamma_t = dB_t+\beta_t(\gamma)\,dt$, $\gamma_0=0$.

Moreover, it is well known that [see the proof of (5.7) below in a much more
complicated case]

(5.1) $H(Q/\mathbb P) = \tfrac12\mathbb E^Q\int_0^T|\beta_t(\gamma)|^2\,dt.$

Consider the Girsanov transformation $\Phi(\gamma) := \gamma-\int_0^\cdot\beta_t(\gamma)\,dt$. Then the
law of $(\gamma,\Phi(\gamma))$ under $Q$ is a coupling of $(Q,\mathbb P)$. Hence, w.r.t. the Cameron-
Martin metric $d_H$ given in Remark 4.3,

(5.2) $(W_2^{d_H}(Q,\mathbb P))^2\le\mathbb E^Qd_H(\gamma,\Phi(\gamma))^2 = \mathbb E^Q\int_0^T|\beta_t(\gamma)|^2\,dt = 2H(Q/\mathbb P),$

that is, $\mathbb P\in T_2(1)$ on $(C([0,T],\mathbb R^d),d_H)$. We see now why this is sharp.
Indeed, if $\beta_t$ is deterministic (or, equivalently, $Q$ is a Gaussian measure), we
claim that

$[W_2^{d_H}(Q,\mathbb P)]^2 = \int_0^T|\beta_t|^2\,dt = 2H(Q/\mathbb P).$

This follows from the following observation:

Lemma 5.1. Let $X$ be a random variable valued in a Banach space $E$
and $H$ be a separable Hilbert space continuously embedded in $E$. Then for
any element $h\in H$,

$W_2^{d_H}(\mathbb P_X,\mathbb P_{X+h}) = \|h\|_H,$

where $\mathbb P_X$ is the law of $X$, $d_H(x,y) := \|x-y\|_H$ if $x-y\in H$ and $+\infty$
otherwise.

Proof. At first $[W_2^{d_H}(\mathbb P_X,\mathbb P_{X+h})]^2\le\mathbb E\|X-(X+h)\|_H^2 = \|h\|_H^2$. To
show the inverse inequality, let $\pi$ be a probability measure on $E^2$ such
that its marginal laws are, respectively, the laws of $X$ and $X+h$, and
$\iint\|y-x\|_H^2\,\pi(dx,dy)<+\infty$. Since $y-(x+h)$ is centered in the sense that
$\mathbb E^\pi\langle e_i,y-(x+h)\rangle_H=0$, where $(e_i)$ is an orthonormal basis of $H$, we have by
Jensen's inequality,

$\iint\|y-x\|_H^2\,\pi(dx,dy) = \iint\|h+(y-(x+h))\|_H^2\,\pi(dx,dy)\ge\|h\|_H^2,$

the desired result. □

Considering the mapping $\Psi(\gamma) = \gamma(T)$, which verifies

$|\Psi(\gamma_1)-\Psi(\gamma_2)|\le\sqrt T\,d_H(\gamma_1,\gamma_2),$

we get by Lemma 2.1 and (5.2) that $\mathcal N(0,TI_d)\in T_2(C)$ on $\mathbb R^d$ w.r.t. the
Euclidean metric with the sharp constant $C=T$ (the theorem of Talagrand).
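The finite-dimensional statement just obtained, and the sharpness on shifts given by Lemma 5.1, can be checked by hand in dimension one, where both sides are explicit. The sketch below assumes the classical closed forms for real Gaussian laws, $W_2^2(\mathcal N(m,s^2),\mathcal N(0,1)) = m^2+(s-1)^2$ and $H(\mathcal N(m,s^2)/\mathcal N(0,1)) = \tfrac12(s^2+m^2-1)-\log s$; the function names and the grid of parameters are ours.

```python
import math

def w2_squared(m, s):
    """Squared L2-Wasserstein distance between N(m, s^2) and N(0, 1) on the real line."""
    return m ** 2 + (s - 1.0) ** 2

def relative_entropy(m, s):
    """Relative entropy H(N(m, s^2) / N(0, 1))."""
    return 0.5 * (s ** 2 + m ** 2 - 1.0) - math.log(s)

# Each entry: (m, s, W_2^2, 2H); T_2(1) asks for W_2^2 <= 2H.
checks = [(m, s, w2_squared(m, s), 2.0 * relative_entropy(m, s))
          for m in (-2.0, 0.0, 0.5, 3.0) for s in (0.25, 1.0, 2.0)]
```

For pure shifts ($s=1$) the two quantities coincide, which is the one-dimensional instance of the equality case discussed above; for $s\ne1$ the inequality reduces to $\log s\le s-1$.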

Remark 5.2. Gentil [7] proved the dual (functional) version of the
$T_2$-inequality of the Wiener measure w.r.t. the Cameron-Martin metric by
generalizing the approach in [1]. The proof here is completely different and
seems to be simpler and more direct.

Remark 5.3. Recall the method of Talagrand for proving his $T_2(C)$
for $\mathcal N(0,I_d)$. At first, by independent tensorization, he reduces to dimension
1. And in dimension one, he uses the optimal transportation of Fréchet
pushing forward $\gamma=\mathcal N(0,1)$ to $f\,d\gamma$, and a direct integration by parts yields
miraculously his $T_2(C)$. The method here is completely different: we use the
Girsanov transformation which pulls $Q$ back to $\mathbb P$ instead of an (eventual)
optimal transportation pushing $\mathbb P$ forward to $Q$. The approach of Talagrand
was generalized recently by Feyel and Üstünel [6], who succeeded in constructing the
optimal transportation from $\mathbb P$ to $Q$ on an abstract Wiener space $(W,H,\mathbb P)$.

We learned very recently (10 months after our first version) from Fang
that the method of Girsanov transformation here has been used by Feyel
and Üstünel [5] in a less elementary manner. So the result of this paragraph
is due to them.

5.2. $T_2$-inequality of diffusions w.r.t. the Cameron-Martin metric. We
now generalize the preceding argument to the solution of the SDE

$dX_t = dB_t+b(X_t)\,dt$, $X_0=x\in\mathbb R^d$,

where $(B_t)$ is an $\mathbb R^d$-valued Brownian motion. We assume that $b\in C^1$ and

$\|\nabla b\|\le K.$

For any path $\gamma\in C([0,T],\mathbb R^d)$ with $\gamma(0)=0$, let $\Phi(\gamma)=\eta$ be the solution of

$\eta(t) = x+\gamma(t)+\int_0^tb(\eta(s))\,ds.$

Then the solution of the SDE above is given by $X_\cdot=\Phi(B_\cdot)$. Hence, for
proving the $T_2$-inequality of $X_\cdot$ w.r.t. the metric $d_H$, it is enough to show
that $\Phi$ is $d_H$-Lipschitzian. To this end, consider

$g(t) = \frac{d}{d\varepsilon}\Phi(\gamma+\varepsilon h)(t)\Big|_{\varepsilon=0},$

where $h\in H$ is fixed. It satisfies

$g(t) = h(t)+\int_0^t\nabla b(\eta(s))g(s)\,ds.$

Its solution is given by

$g(t) = \int_0^tJ(s,t)h'(s)\,ds,$

where $J(s,t)$ is the solution of the matrix differential equation

(5.3) $J(s,s)=I_d$, $\frac{\partial}{\partial t}J(s,t) = \nabla b(\eta(t))J(s,t)$.

Since $\nabla^sb\le BI_d$ for some $B\le K$, we have $|J(s,t)y|\le e^{B(t-s)}|y|$, $\forall y\in\mathbb R^d$.
Consequently,

$|g(t)|\le\int_0^te^{B(t-s)}|h'(s)|\,ds.$

Thus, by Cauchy-Schwarz,

$\|g\|_H^2\le2\int_0^T|h'(t)|^2\,dt+2\int_0^T|\nabla b(\eta(t))g(t)|^2\,dt\le2\|h\|_H^2+2K^2\int_0^T\Big(\int_0^te^{B(t-s)}|h'(s)|\,ds\Big)^2dt.$

Note that

$\int_0^T\Big(\int_0^te^{B(t-s)}|h'(s)|\,ds\Big)^2dt = \int_0^T\int_0^T|h'(u)||h'(v)|\Big(\int_{u\vee v}^Te^{2Bt-B(u+v)}\,dt\Big)du\,dv = \langle\Gamma|h'|,|h'|\rangle_{L^2([0,T])},$

where

$\Gamma(u,v) := e^{-B(u+v)}\,\frac{e^{2BT}-e^{2B(u\vee v)}}{2B}$ if $B\ne0$, and $\Gamma(u,v) := T-u\vee v$ if $B=0$,

and $\Gamma f(u) := \int_0^T\Gamma(u,v)f(v)\,dv$. Let $\lambda_{\max}(\Gamma)$ be the largest eigenvalue of $\Gamma$
in $L^2([0,T])$. We have $\lambda_{\max}(\Gamma)\le\|\Gamma\|_1$, the norm of $\Gamma$ in $L^1([0,T])$. It is easy
to get $\|\Gamma\|_1\le\frac{1}{B^2}$ if $B<0$, $\|\Gamma\|_1\le\frac{e^{2BT}}{2B^2}$ if $B>0$, and $\|\Gamma\|_1=\frac{T^2}{2}$ if $B=0$.
Thus, setting

(5.4) $\alpha^2 := \alpha^2(T,K,B) = 2\Big(1+\frac{K^2}{B^2}\Big)$ if $B<0$; $2\Big(1+\frac{K^2e^{2BT}}{2B^2}\Big)$ if $B>0$; $2\Big(1+\frac{K^2T^2}{2}\Big)$ if $B=0$,

we get by the estimates above that $\|g\|_H^2\le\alpha^2\|h\|_H^2$, that is, $\|\Phi\|_{Lip(d_H)}\le\alpha$.
Thus, Lemma 2.1 (which remains valid for the pseudo-metric $d_H$) together
with the $T_2$-inequality for the Wiener measure gives us the following:

Proposition 5.4. Assume $\nabla^sb\le BI_d$ and $\|\nabla b\|\le K$. Then for every
initial point $x$, $\mathbb P_x\in T_2(\alpha^2)$ on $C([0,T],\mathbb R^d)$ w.r.t. the metric $d_H$, where $\alpha^2$
is given by (5.4).
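The $L^1$-norm estimates behind the constant in Proposition 5.4 can be checked numerically. The sketch below is a rough illustration only. It takes as working assumptions the kernel $\Gamma(u,v)=e^{-B(u+v)}(e^{2BT}-e^{2B(u\vee v)})/(2B)$ (and $T-u\vee v$ for $B=0$) and the bounds $\|\Gamma\|_1\le1/B^2$ for $B<0$, $\|\Gamma\|_1\le e^{2BT}/(2B^2)$ for $B>0$, $\|\Gamma\|_1=T^2/2$ for $B=0$. The function names, the midpoint grid and its size are ours.

```python
import numpy as np

def gamma_kernel(u, v, B, T):
    """Kernel Gamma(u, v) = integral over [max(u, v), T] of exp(2Bt - B(u+v)) dt."""
    m = np.maximum(u, v)
    if B == 0.0:
        return T - m
    return np.exp(-B * (u + v)) * (np.exp(2 * B * T) - np.exp(2 * B * m)) / (2 * B)

def l1_norm(B, T, n=1000):
    """Midpoint-rule approximation of sup_v of the integral of Gamma(u, v) du over [0, T]."""
    dt = T / n
    t = (np.arange(n) + 0.5) * dt
    K = gamma_kernel(t[:, None], t[None, :], B, T)
    return float((K.sum(axis=0) * dt).max())

def alpha_squared(T, K, B):
    """Constant alpha^2 = 2 (1 + K^2 * bound on ||Gamma||_1), under the assumed bounds."""
    if B < 0:
        lam = 1.0 / B ** 2
    elif B > 0:
        lam = float(np.exp(2 * B * T)) / (2 * B ** 2)
    else:
        lam = T ** 2 / 2
    return 2.0 * (1.0 + K ** 2 * lam)

norm_zero = l1_norm(0.0, 2.0)    # to be compared with T^2 / 2 = 2
norm_neg = l1_norm(-1.0, 5.0)    # to be compared with 1 / B^2 = 1
norm_pos = l1_norm(1.0, 1.0)     # to be compared with e^{2BT} / (2 B^2)
```

For $B<0$ the bound does not depend on $T$, which is what makes the resulting $T_2$-constant uniform in the time horizon in that case.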

Remark 5.5. Of course, the estimate $\|\Phi\|_{Lip(d_H)}\le\alpha$ together with
the log-Sobolev inequality of Gross for the Wiener measure gives us also

$\int_{C([0,T],\mathbb R^d)}F^2\log\frac{F^2}{\mathbb E_{\mathbb P_x}F^2}\,d\mathbb P_x\le2\alpha^2\int_{C([0,T],\mathbb R^d)}|DF|_H^2\,d\mathbb P_x,$

which is better than the Capitaine-Hsu-Ledoux estimate in Remark 4.3
when $B<0$.

It is interesting to investigate whether this proposition and the
corresponding log-Sobolev inequality continue to hold in the case where
$\nabla^sb\le BI_d$ with $B<0$ without the condition $\|\nabla b\|\le K$.

5.3. $T_2$-inequality of diffusions w.r.t. the $L^2$-metric. Perhaps the most
elementary metric on $C([0,T],\mathbb R^d)$ is the following $L^2[0,T]$-metric,

$d_2(\gamma_1,\gamma_2) := \sqrt{\int_0^T|\gamma_1(t)-\gamma_2(t)|^2\,dt}.$

Indeed, the argument leading to the $T_2$-inequality of the Wiener measure
will yield the following robust $T_2$-inequality w.r.t. the metric above:

Theorem 5.6. Assume that $\sigma$, $b$ are locally Lipschitzian and satisfy (4.5)
for some $\delta>0$, and $\|\sigma\|_\infty := \sup\{|\sigma(x)z|;\ x\in\mathbb R^d,\ |z|\le1\}<+\infty$. Then
$\mathbb P_x\in T_2(C)$ on $C([0,T],\mathbb R^d)$ w.r.t. the $L^2$-metric $d_2$ above for all $x\in\mathbb R^d$
and $T>0$, where the constant $C$ is given by

$C = \frac{\|\sigma\|_\infty^2}{\delta^2}.$

Moreover, $P_T(x,\cdot)\in T_2\big(\frac{\|\sigma\|_\infty^2}{2\delta}\big)$ on $\mathbb R^d$, as well as the unique invariant
probability measure $\mu$ of $(P_t)$.

Remark 5.7. The two $T_2$-inequalities in this theorem are both sharp.
Indeed, let $d=1$, $\sigma(x)=1$, $b(x)=-x/2$, that is, $(X_t)$ is the standard real
Ornstein-Uhlenbeck process, whose invariant measure is $\mathcal N(0,1)$. By this
theorem, $\mu\in T_2(C)$ with $C=\|\sigma\|_\infty^2/(2\delta)=1$, which is sharp.

For the sharpness of the $T_2$-inequality for $\mathbb P_x$ w.r.t. $d_2$, note that any
Gaussian measure $\mathcal N(m,\Sigma)$ on $\mathbb R^n$ satisfies $T_2(C)$ with the sharp constant
$C$ being the largest eigenvalue $\lambda_{\max}(\Sigma)$ of the covariance matrix $\Sigma$. This
can be extended easily to any Gaussian measure $\nu=\mathcal N(m,\Sigma)$ on any
separable Hilbert space $G$, where the covariance matrix $\Sigma$ is a Hilbert-Schmidt
operator on $G$. Hence, if $(X_t)_{t\ge0}$ is a Gaussian process with paths a.s. in
$L^2([0,T],dt)$, then its law $\mathbb P$ satisfies $T_2(C)$ on $L^2([0,T],dt)$ with the
sharp constant $C=\lambda_{\max}(\Sigma)$, the largest eigenvalue of the operator

$\Sigma f(s) := \int_0^T\mathrm{Cov}(X_s,X_t)f(t)\,dt$, $\forall f\in L^2([0,T],dt)$.

For the Ornstein-Uhlenbeck process law $\mathbb P_0$ above starting from 0,
$\mathrm{Cov}(X_s,X_t) = \exp(-|t-s|/2)-\exp(-(s+t)/2)$. In that case,

$\lambda_{\max}(\Sigma)\ge\frac1T\langle\Sigma1,1\rangle_{L^2([0,T])}\to4$ as $T\to\infty$.

Hence, the constant $C=\|\sigma\|_\infty^2/\delta^2=4$ in the $T_2$-inequality for $\mathbb P_0$ given by
our theorem becomes sharp when $T\to+\infty$.
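The asymptotic sharpness claimed in Remark 5.7 can be observed numerically. The sketch below is only a rough check: it discretizes the covariance operator of the Ornstein-Uhlenbeck process started at 0, with kernel $\exp(-|t-s|/2)-\exp(-(s+t)/2)$, on a midpoint grid and computes its largest eigenvalue. The function name and the grid size are our own choices, and the discretization introduces a small error.

```python
import numpy as np

def lambda_max_ou(T, n=1000):
    """Largest eigenvalue of the discretized covariance operator of the
    Ornstein-Uhlenbeck process started at 0, acting on L^2([0, T])."""
    dt = T / n
    t = (np.arange(n) + 0.5) * dt          # midpoint grid on [0, T]
    s, u = t[:, None], t[None, :]
    cov = np.exp(-np.abs(s - u) / 2) - np.exp(-(s + u) / 2)
    # eigvalsh returns the eigenvalues of a symmetric matrix in ascending order.
    return float(np.linalg.eigvalsh(cov * dt)[-1])

lam_10 = lambda_max_ou(10.0)
lam_100 = lambda_max_ou(100.0)
```

One expects `lam_10 < lam_100`, with `lam_100` already close to the limiting value 4, in agreement with the constant $\|\sigma\|_\infty^2/\delta^2=4$ of the theorem.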






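Proof. We shall prove that for any $\varepsilon>0$, for any probability measure
$Q$ on $C([0,T],\mathbb R^d)$,

(5.5) $(W_2^{d_2}(Q,\mathbb P_x))^2\le2\,\frac{(1-e^{(\varepsilon-2\delta)T})\,\|\sigma\|_\infty^2}{\varepsilon(2\delta-\varepsilon)}\,H(Q/\mathbb P_x)$

and for any probability measure $\nu$ on $\mathbb R^d$,

(5.6) $(W_2(\nu,P_T(x,\cdot)))^2\le2\,\frac{\sup_{t\in[0,T]}e^{(\varepsilon-2\delta)t}\,\|\sigma\|_\infty^2}{\varepsilon}\,H(\nu/P_T(x,\cdot)).$

Choosing $\varepsilon=\delta$ in (5.5), we get the first claim in the theorem; letting $\varepsilon\uparrow2\delta$,
we get $P_T(x,\cdot)\in T_2\big(\frac{\|\sigma\|_\infty^2}{2\delta}\big)$ by (5.6) and then $\mu\in T_2\big(\frac{\|\sigma\|_\infty^2}{2\delta}\big)$ by Lemma 2.2
and the fact that $P_T(x,\cdot)\to\mu$ as $T\to\infty$ (see Remark 4.7).

It is enough to prove (5.5) for $Q\ll\mathbb P_x$ and $H(Q/\mathbb P_x)<+\infty$. We divide
its proof into two steps.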

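Step 1. We do at first some preparation of stochastic calculus. Let $(\Omega,\mathcal F,\mathbb P)$
be a complete probability space on which an $n$-dimensional Brownian motion
$(B_t)=(B_t^j)_{j=1,\dots,n}$ is defined and let $\mathcal F_t = \sigma(B_s,s\le t)^{\mathbb P}$ (completion
by $\mathbb P$). Let $X_t(x)$ be the unique solution of (4.1) starting from $x$. Then the
law of $X_\cdot(x)$ is $\mathbb P_x$. Consider

$\tilde Q := \frac{dQ}{d\mathbb P_x}(X_\cdot(x))\cdot\mathbb P$, $M_t := \mathbb E^{\mathbb P}\Big(\frac{dQ}{d\mathbb P_x}(X_\cdot(x))\Big|\mathcal F_t\Big)$, $\forall t\in[0,T]$.

Remark that, as $Q$ is a probability measure and the law of $X(x)$ under $\mathbb P$
is exactly $\mathbb P_x$, we have

$\int_\Omega\frac{dQ}{d\mathbb P_x}(X(x))\,d\mathbb P = \int_{C([0,T],\mathbb R^d)}\frac{dQ}{d\mathbb P_x}(w)\,d\mathbb P_x(w) = Q(C([0,T],\mathbb R^d)) = 1.$

$(M_t)$ is a martingale which can and will be chosen as a continuous martingale. Let
$\tau := \inf\{t\in[0,T];\ M_t=0\}$ with the convention that $\inf\emptyset := T+$, where $T+$
is an artificial added element larger than $T$, but smaller than any $a>T$.
Then $\tilde Q(\tau=T+)=1$ and

$M_t = 1_{t<\tau}\exp\big(L_t-\tfrac12[L]_t\big),$

where $L_t := \int_0^t\frac{dM_s}{M_s}$, $\forall t<\tau$. $L$, being a $\mathbb P$-local martingale on $[0,\tau)$, can
be represented in the following way: there is a predictable process $(\beta_t) =
(\beta_t^j)_{0\le t<\tau}$ such that $\int_0^t|\beta_s|^2\,ds<+\infty$, $\mathbb P$-a.s. on $[t<\tau]$ and

$L_t = \int_0^t\langle\beta_s,dB_s\rangle = \sum_{j=1}^n\int_0^t\beta_s^j\,dB_s^j$, $\forall t<\tau$.

Let $\tau_n = \inf\{t\in[0,\tau[;\ [L]_t=n\}$ with the same convention that $\inf\emptyset := T+$.
It is elementary that $\tau_n\uparrow\tau$, $\mathbb P$-a.s. Hence, by martingale convergence,

$H(Q/\mathbb P_x) = H(\tilde Q/\mathbb P) = \mathbb E^{\mathbb P}M_T\log M_T = \lim_{n\to\infty}\mathbb E^{\mathbb P}M_{T\wedge\tau_n}\log M_{T\wedge\tau_n} = \lim_{n\to\infty}\mathbb E^{\tilde Q}\big(L_{T\wedge\tau_n}-\tfrac12[L]_{T\wedge\tau_n}\big).$

By Girsanov's formula, $(L_{t\wedge\tau_n}-[L]_{t\wedge\tau_n})_{t\in[0,T]}$ is a $\tilde Q$-local martingale, then
a true martingale since its quadratic variation process under $\tilde Q$, being again
$([L]_{t\wedge\tau_n})$, is bounded by $n$. Consequently, $\mathbb E^{\tilde Q}(L_{t\wedge\tau_n}-[L]_{t\wedge\tau_n})=0$.
Substituting it into the preceding equality and noting that $\tilde Q(\tau_n\uparrow\tau=T+)=1$,
we get by monotone convergence,

(5.7) $H(Q/\mathbb P_x) = \tfrac12\lim_{n\to\infty}\mathbb E^{\tilde Q}[L]_{T\wedge\tau_n} = \tfrac12\mathbb E^{\tilde Q}[L]_T = \tfrac12\mathbb E^{\tilde Q}\int_0^T|\beta_t|^2\,dt.$

Notice that this is an extension of (5.1).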

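Step 2. By Girsanov's theorem,

$\tilde B_t := B_t-\int_0^t\beta_s\,ds$

is a $\tilde Q$-local martingale with $[\tilde B^i,\tilde B^j]_t = [B^i,B^j]_t = 1_{i=j}\,t$, hence, a Brownian
motion under $\tilde Q$. Under $\tilde Q$, $X_t=X_t(x)$ verifies

$dX_t = \sigma(X_t)\,d\tilde B_t+b(X_t)\,dt+\sigma(X_t)\beta_t\,dt$, $X_0=x$.

We now consider the solution $Y_t$ (under $\tilde Q$) of

$dY_t = \sigma(Y_t)\,d\tilde B_t+b(Y_t)\,dt$, $Y_0=x$.

The law of $(Y_t)_{t\in[0,T]}$ under $\tilde Q$ is exactly $\mathbb P_x$. In other words, $(X,Y)$ under
$\tilde Q$ is a coupling of $(Q,\mathbb P_x)$.
Setting

$\hat X_t := X_t-Y_t$, $\hat\sigma_t := \sigma(X_t)-\sigma(Y_t)$, $\hat b_t := b(X_t)-b(Y_t)$,

we have

(5.8) $d|\hat X_t|^2 = [2\langle\hat X_t,\hat b_t+\sigma(X_t)\beta_t\rangle+\mathrm{tr}(\hat\sigma_t\hat\sigma_t^t)]\,dt+2\langle\hat X_t,\hat\sigma_t\,d\tilde B_t\rangle.$

Letting $\hat\tau_n := \inf\{t\in[0,T];\ |\hat X_t|=n\}$, we have that for any $\varepsilon>0$,

$\mathbb E^{\tilde Q}|\hat X_{t\wedge\hat\tau_n}|^2\le-2\delta\int_0^t\mathbb E^{\tilde Q}|\hat X_{s\wedge\hat\tau_n}|^2\,ds+2\mathbb E^{\tilde Q}\int_0^{t\wedge\hat\tau_n}\langle\hat X_s,\sigma(X_s)\beta_s\rangle\,ds$

$\le(\varepsilon-2\delta)\int_0^t\mathbb E^{\tilde Q}|\hat X_{s\wedge\hat\tau_n}|^2\,ds+\frac1\varepsilon\mathbb E^{\tilde Q}\int_0^t|\sigma(X_s)\beta_s|^2\,ds.$

Gronwall's lemma, together with Fatou's lemma, gives us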



(5.9) $\mathbb E^{\tilde Q}|\hat X_t|^2\le\frac{\|\sigma\|_\infty^2}{\varepsilon}\,\mathbb E^{\tilde Q}\int_0^te^{(\varepsilon-2\delta)(t-s)}|\beta_s|^2\,ds$, $\forall t\ge0$.

Thus

$\mathbb E^{\tilde Q}\int_0^T|\hat X_t|^2\,dt\le\frac{\|\sigma\|_\infty^2}{\varepsilon}\,\mathbb E^{\tilde Q}\int_0^T|\beta_s|^2\,ds\int_s^Te^{(\varepsilon-2\delta)(t-s)}\,dt,$

which, together with (5.7), yields the desired (5.5). For (5.6), notice that by the
key remark (2.1),

$H(\nu/P_T(x,\cdot)) = \inf\{H(Q|_{C([0,T],\mathbb R^d)}/\mathbb P_x|_{C([0,T],\mathbb R^d)});\ Q_T := Q(x_T\in\cdot)=\nu\}.$

And for each such $Q$, defining $\tilde Q$ as before, we have

$[W_2(\nu,P_T(x,dy))]^2\le\mathbb E^{\tilde Q}|\hat X_T|^2,$

and conclude using (5.9). □

Remark 5.8. After the first version was submitted, we learned from M.
Ledoux of the work of Wang [20], who obtained $T_2(C)$ w.r.t. the $L^2$-metric
for elliptic diffusions with the lower bounded $\Gamma_2$ condition of Bakry on a
Riemannian manifold. His method consists of a continuous time tensorization
of the $T_2(C)$ of the heat kernels (which is true by the log-Sobolev inequality
due to Bakry). Hence, the method and the result here are very different
from his: the volatility coefficient $\sigma$ could be completely degenerate in
Theorem 5.6, and our proof does not rely on the log-Sobolev inequality, which
is unknown in our context.

Remark 5.9. By the proof above, we see that (5.5) and (5.6) hold
under (4.5) even with $\delta\le0$, except now the $T_2$-constant goes to infinity as
$T\to+\infty$.

Remark 5.10. The local Lipschitzian condition on $\sigma$, $b$ in this theorem
can be substituted by their continuity together with the well-posedness of
the martingale problem associated with $(\sigma\sigma^t,b)$. Indeed, one can find $(\sigma_n,b_n)$
tending locally uniformly to $(\sigma,b)$, such that $(\sigma_n,b_n)$ is locally Lipschitzian,
$\|\sigma_n\|_\infty\le\|\sigma\|_\infty$ and verifies condition (4.5) with the same $\delta$. Now the desired
result follows from Theorem 5.6 and Lemma 2.2.

As indicated in [1], many interesting consequences can be derived from
this result. For instance:

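Corollary 5.11. Under the assumptions of Theorem 5.6, we have for
any $T>0$,

(a) for any smooth cylindrical function $F$ on $G := L^2([0,T],dt;\mathbb R^d)\supset
C([0,T],\mathbb R^d)$, that is,

$F\in\mathcal FC_b^\infty := \{f(\langle\gamma,h_1\rangle,\dots,\langle\gamma,h_n\rangle);\ n\ge1,\ h_i\in H,\ f\in C_b^\infty(\mathbb R^n)\}$

[where $\langle\gamma_1,\gamma_2\rangle := \int_0^T\gamma_1(t)\gamma_2(t)\,dt$], the following Poincaré inequality holds:

(5.10) $\mathrm{Var}_{\mathbb P_x}(F)\le\frac{\|\sigma\|_\infty^2}{\delta^2}\int_{C([0,T],\mathbb R^d)}\|\nabla F(\gamma)\|_G^2\,d\mathbb P_x(\gamma),$

where $\mathrm{Var}_{\mathbb P_x}(F)$ is the variance of $F$ under the law $\mathbb P_x$, and $\nabla F(\gamma)\in G$ is the
gradient of $F$ at $\gamma$.

(b) For any $g\in C_b^\infty(\mathbb R^d)$,

(5.11) $\mathrm{Var}_{P_T(x,\cdot)}(g)\le\frac{\|\sigma\|_\infty^2}{2\delta}\int_{\mathbb R^d}|\nabla g(y)|^2\,P_T(x,dy).$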

(c) (Inequality of Tsirel'son type.) For any nonempty subset $K$ in $G$ such
that $Z(\gamma) := \sup_{h\in K}\langle\gamma,h\rangle\in L^1(\mathbb P_x)$, then



(5.12) f expf^— sup [(7, W^expf^-E^zV 

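(d) (Inequality of Hoeffding type.) For any $V:\mathbb R^d\to\mathbb R$ such that $\|V\|_{Lip}\le\alpha$,

$\mathbb P\Big(\frac1T\int_0^TV(X_t(x))\,dt-\mathbb E\frac1T\int_0^TV(X_t(x))\,dt>r\Big)\le\exp\Big(-\frac{Tr^2\delta^2}{2\alpha^2\|\sigma\|_\infty^2}\Big)$, $\forall r>0$.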

Proof. For part (a), for any $F(\gamma) = f(\langle\gamma,h_1\rangle,\dots,\langle\gamma,h_n\rangle)\in\mathcal FC_b^\infty$, we
may assume without loss of generality that $h_1,\dots,h_n$ are orthonormal. In
such case,

$\Phi:\gamma\to(\langle\gamma,h_1\rangle,\dots,\langle\gamma,h_n\rangle)$, $G\to\mathbb R^n$

is Lipschitzian with $\|\Phi\|_{Lip}\le1$. Hence, $\nu := \mathbb P_x\circ\Phi^{-1}\in T_2(\|\sigma\|_\infty^2/\delta^2)$ on $\mathbb R^n$
by Lemma 2.1. Thus, the result of [1], Section 4.1 entails

$\mathrm{Var}_{\mathbb P_x}(F) = \mathrm{Var}_\nu(f)\le\frac{\|\sigma\|_\infty^2}{\delta^2}\int|\nabla f|^2\,d\nu = \frac{\|\sigma\|_\infty^2}{\delta^2}\int_{C([0,T],\mathbb R^d)}\|\nabla F(\gamma)\|_G^2\,d\mathbb P_x(\gamma).$

Part (b) is a consequence of Theorem 5.6 by [1], Section 4.1. One can
derive part (c) from Theorem 5.6 by the same argument as in the
finite-dimensional case given in [1], Section 6.1. For part (d), note that $T_2(C)\Longrightarrow
T_1(C)$. Moreover, the function $F(\gamma) := (1/T)\int_0^TV(\gamma(t))\,dt$ on $C([0,T],\mathbb R^d)$
is Lipschitzian w.r.t. the $L^2$-metric and $\|F\|_{Lip}\le\alpha/\sqrt T$. Hence, part (d)
follows from Theorem 1.1. □
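As a sanity check on part (d), one may compare the Hoeffding-type bound with a Monte Carlo estimate for the real Ornstein-Uhlenbeck process of Remark 5.7 ($\sigma=1$, $b(x)=-x/2$, $\delta=1/2$) and $V(x)=x$, so that $\alpha=1$. The sketch below rests on several assumptions of ours: the bound is taken in the form $\exp(-Tr^2\delta^2/(2\alpha^2\|\sigma\|_\infty^2))$ obtained from $T_1(\|\sigma\|_\infty^2/\delta^2)$ and $\|F\|_{Lip}\le\alpha/\sqrt T$, the diffusion is replaced by its Euler discretization, and the expectation is replaced by the empirical mean. Function names, sample sizes and the seed are illustrative.

```python
import numpy as np

def hoeffding_bound(r, T, alpha, delta, sigma_sup):
    """Assumed form of the bound: exp(-T r^2 delta^2 / (2 alpha^2 sigma_sup^2))."""
    return float(np.exp(-T * r ** 2 * delta ** 2 / (2.0 * alpha ** 2 * sigma_sup ** 2)))

def time_average_tail(r, T=10.0, n_steps=1000, n_paths=4000, seed=1):
    """Empirical probability that (1/T) * integral of X_t dt exceeds its mean by more than r,
    for an Euler scheme of dX = dB - X/2 dt started at 0."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    X = np.zeros(n_paths)
    running = np.zeros(n_paths)
    for _ in range(n_steps):
        running += X * dt                       # left-point Riemann sum of the path
        X = X - 0.5 * X * dt + np.sqrt(dt) * rng.standard_normal(n_paths)
    avg = running / T
    return float(np.mean(avg - avg.mean() > r))  # centered by the empirical mean

r = 1.0
empirical = time_average_tail(r)
bound = hoeffding_bound(r, T=10.0, alpha=1.0, delta=0.5, sigma_sup=1.0)
```

The empirical tail is expected to sit well below `bound`; the inequality is a worst-case estimate over all Lipschitz $V$, not a sharp asymptotic for this particular functional.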



Remark 5.12. Let us compare the $T_2(C)$-inequality on $C([0,T],\mathbb R^d)$
w.r.t. the $L^2$-metric $d_2$ or the Cameron-Martin metric $d_H$, denoted,
respectively, by $T_2(C/d_2)$, $T_2(C/d_H)$.

(a) If $\gamma_1(0)=\gamma_2(0)$, then $d_2^2(\gamma_1,\gamma_2)\le\frac{4T^2}{\pi^2}d_H^2(\gamma_1,\gamma_2)$ by the classical Poincaré
inequality. Hence, if the law $\mathbb P_x$ of our diffusion starting from $x$ verifies
$T_2(C/d_H)$ on $C([0,T],\mathbb R^d)$, then $\mathbb P_x\in T_2(C(4T^2/\pi^2)/d_2)$ on $C([0,T],\mathbb R^d)$.
The order $T^2$ in the last $T_2$-inequality is the correct order. For example,
for the real Wiener measure $\mathbb P$, we see by Section 5.1 that $\mathbb P\in T_2(1/d_H)$
on $C([0,T],\mathbb R^d)$, but the largest eigenvalue $\lambda_{\max}(\Gamma)$ of the covariance
function $\Gamma(s,t)=s\wedge t$ in $L^2([0,T])$ verifies

$\lambda_{\max}(\Gamma)\ge\frac{\langle\Gamma1,1\rangle}{\langle1,1\rangle} = \frac{T^2}{3}.$

Thus, by the same analysis as in Remark 5.7, $\mathbb P\in T_2(CT^2/d_2)$ with
$4/\pi^2\ge C=\lambda_{\max}(\Gamma)/T^2\ge1/3$.

(b) The contribution of $|\gamma_1(t)-\gamma_2(t)|$ to the $L^2$-metric is homogeneous in
time $t$, but not at all to the Cameron-Martin metric $d_H$. This is the
principal reason for the following:

(b.1) The $T_2(C/d_H)$ is well adapted to the small time asymptotics of
the diffusions, but not to their large time asymptotics. For instance, if
$\mathbb P_x\in T_2(C/d_H)$, since for $Z(\gamma) = \sup_{0\le t\le T}|\gamma(t)-\gamma(0)|$, $\|Z\|_{Lip(d_H)}\le
\sqrt T$, then by Theorem 1.1 (its necessary part remains true for a $d_H$-
Lipschitzian function $F$ which is, moreover, integrable, by following the
proof in [2]),

$\mathbb P_x\Big(\sup_{0\le t\le T}|X_t(x)-x|-\mathbb E_x\sup_{0\le t\le T}|X_t(x)-x|>r\Big)\le\exp\Big(-\frac{r^2}{2CT}\Big),$

which is of the correct order when $T\to0+$, but completely meaningless
for $T$ large. See [21] for the nonadaptability of the log-Sobolev inequality
w.r.t. $d_H$ for the large time asymptotics of the diffusions.

(b.2) On the contrary, we have seen that the $T_2(C/d_2)$ is very well adapted
for the large time asymptotics of the diffusions.
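The two-sided estimate on the Brownian covariance operator in part (a) can be checked numerically. The sketch below discretizes the kernel $s\wedge t$ on a midpoint grid of $[0,T]$ and computes the largest eigenvalue; the function name, the value of $T$ and the grid size are our own choices, and the comparison value $4T^2/\pi^2$ is the classical largest eigenvalue of this operator.

```python
import numpy as np

def lambda_max_brownian(T, n=1000):
    """Largest eigenvalue of the discretized operator with kernel min(s, t) on L^2([0, T])."""
    dt = T / n
    t = (np.arange(n) + 0.5) * dt          # midpoint grid on [0, T]
    cov = np.minimum(t[:, None], t[None, :])
    # eigvalsh returns the eigenvalues of a symmetric matrix in ascending order.
    return float(np.linalg.eigvalsh(cov * dt)[-1])

T = 3.0
lam = lambda_max_brownian(T)
upper = 4.0 * T ** 2 / np.pi ** 2   # Poincare constant of part (a)
lower = T ** 2 / 3.0                # Rayleigh quotient with the constant function
```

Up to discretization error, `lam` agrees with `upper`, so the order $T^2$ obtained from the Poincaré inequality cannot be improved for the Wiener measure.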
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Remark 5.13. Theorem 5.6, together with Corollary 3.5, is our main 
new example for which $T_2(C)$ is true but the inequality of log-Sobolev is
unknown. They are our (very partial) answer to Question 3 in the Introduc- 
tion. We believe that in the situations of Theorem 5.6 and Corollary 3.5, 
the log-Sobolev inequality may fail without further regularity assumptions 
on the volatility coefficient a. 
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